GENERATING SEMANTIC DESCRIPTIONS
FROM DRAWINGS OF SCENES WITH SHADOWS* 

Abstract

1.0 INTRODUCTION

How do we ascertain the shapes of unfamiliar objects? 
Why do we so seldom confuse shadows with real things? How do 
we "factor out" shadows when looking at scenes? How are we 
able to see the world as essentially the same whether it is a
bright sunny day, an overcast day, or a night with only
streetlights for illumination? In the terms of this paper,
how can we recognize the identity of figures 1.1 and 1.2? Do
we use learning and knowledge to interpret what we see, or do
we somehow automatically see the world as stable and 
independent of lighting? What portions of scenes can we 
understand from local features alone, and what configurations 
require the use of global hypotheses? 

Various theories have been proposed to explain how 
people extract three-dimensional information from scenes 
(Gibson 1950 is an excellent reference). It is well known
that we get depth and distance information from motion
parallax and, for objects fairly close to us, from eye focus
feedback and parallax. But this does not explain how we are
able to understand the three-dimensional nature of
photographed scenes. Perhaps we acquire knowledge of the
shapes of objects by handling them and moving around them,
and use rote memory to assign shape to those objects when we
recognize them in scenes. But this does not explain how we
can perceive the shapes of objects we have never seen before.
Similarly, the fact that we can tell the shapes of many 
objects from as simple a representation as a line drawing 
shows that we do not need texture or other fine details to 
ascertain shape, though we may of course use texture 
gradients and other details to define certain edges. 

I undertook this research with the belief that it is 
possible to discover rules with which a program can obtain a 
three-dimensional model of a scene, given only a reasonably 
good line drawing of a scene. Such a program might have 
applications both in practical situations and in developing 
better theories of human vision. Introspectively, I do not
feel that there is a great difference between seeing 
"reality" and seeing line drawings. 

Moreover, there are considerable difficulties both in
processing stereo images (such as the problem of deciding
which points on each retina correspond to the same scene
point; see Guzman 1968, Lerman 1970) and in building a system
incorporating hand-eye coordination which could be used to
help explore and disambiguate a scene (Gaschnig 1971). It
seems to me that while the use of range finders, multiple
light sources to help eliminate shadows (Shirai 1971), and
the restriction of scenes to known objects may all prove
useful for practical robots, these approaches avoid coming to
grips with the nature of human perception vis-a-vis the
implicit three-dimensional information in line drawings of
real scenes. While I would be very cautious about claiming
parallels between the rules in my program and human visual
processes, at the very least I have demonstrated a number of
capable vision programs which require only fixed, monocular
line drawings for their operation.

In this thesis I describe a working collection of 
computer programs which reconstruct three-dimensional 
descriptions from line drawings which are obtained from 
scenes composed of plane-faced objects under various lighting 
conditions. In this description the system identifies shadow
lines and regions, groups regions which belong to the same
object, and notices such relations as contact or lack of
contact between the objects, support and in-front-of/behind
relations between the objects, as well as information about
the spatial orientation of various regions, all using the
description it has generated.

1.1 DESCRIPTIONS

The overall goal of the system is to provide a precise
description of a plausible scene which could give rise to a
particular line drawing. It is therefore important to have a
good language in which to describe features of scenes. Since 
I wish to have the program operate on unfamiliar objects, the 
language must be capable of describing such objects. The 
language I have used is an expansion of the labeling system 
developed by Huffman (Huffman 1971) in the United States and 
Clowes (Clowes 1971) in Great Britain. 

The language employs labels which are assigned to line 
segments and regions in the scene. These labels describe the
edge geometry, the connection or lack of connection between
adjacent regions, the orientation of each region in three
dimensions, and the nature of the illumination for each
region (illuminated, projected shadow region, or region
facing away from the light source). The goal of the program
is to assign a single label value to each line and region in
the line drawing, except in cases where humans also find a
feature to be ambiguous.

This language allows precise definitions of such
concepts as supported by, in front of, behind, rests against,
shadows, is shadowed by, is capable of supporting, leans on,
and others. Thus, if it is possible to label each feature of
a scene uniquely, then it is possible to directly extract
these relations from the description of the scene based on
this labeling.

1.2 JUNCTION LABELS 

Much of the program's power is based on access to lists 
of possible line label assignments for each type of junction 
in a line drawing. While a natural language analogy to these 
labels could be misleading, I think that it helps in
explaining the basic operation of this portion of the 
program. 



If we think of each possible label for a line as a
letter in the alphabet, then each junction must be
labeled with an ordered list of "letters" to form a
legal "word" in the language. Thus each "word"
represents a physically possible interpretation for a
given junction. Furthermore, each "word" must match the
"words" for surrounding junctions in order to form a
legal "phrase", and all "phrases" in the scene must
agree to form a legal "sentence" for the entire scene.
The knowledge of the system is contained in (1) a
dictionary made up of every legal "word" for each type
of junction, and (2) rules by which "words" can legally
combine with other "words". The range of the dictionary
entries defines the universe of the program; this
universe can be expanded by adding new entries
systematically to the dictionary.

In fact, the "dictionary" need not be a stored list.
The dictionary can consist of a relatively small list of
possible edge geometries for each junction type, and a set of
rules which can be used to generate the complete dictionary
from the original lists. Depending on the amount of computer
memory available, it may either be desirable to store the
complete lists as compiled knowledge or to generate the lists 
when they are needed. In my current program the lists are 
for the most part precompiled. 

The composition of the dictionary is interesting in its
own right. While some basic edge geometries give rise to
many dictionary entries, some give rise to very few. The
total number of entries sharing the same edge geometry can be
as low as three for some ARROW junctions, including shadow
edges, while the number generated by some FORK junction edge
geometries is over 270,000 (including region orientation and
illumination values).

1.3 JUNCTION LABEL ASSIGNMENT

There is a considerable amount of local information
which can be used to select a subset of the total number of
dictionary entries which are consistent with a particular
junction. The first piece of information is already included
implicitly in the idea of junction type. Junctions are typed
according to the number of lines which make up the junction
and the two-dimensional arrangement of these lines. Figure
1.3 shows all the junction types which can occur in the
universe of the program. The dictionary is arranged by
junction type, and a standard ordering is assigned to all the
line segments which make up junctions (except FORKS and
MULTIS).

The program can also use local region brightness and 
line segment direction to preclude the assignment of certain 
labels to lines. For example, if it knows that one region is
brighter than an adjacent region, then the line which
separates the regions can be labeled as the boundary of a
shadow region in only one way. There are other rules which
relate region orientation, light placement and region
illumination, as well as rules which limit the number of
labels which can be assigned to line segments which border
the support surface for the scene. The program is able to
combine all these types of information in finding a list of
appropriate labels for a single junction.

1.4 COMBINATION RULES

Combination rules are used to select from the initial
assignments the label, or labels, which correctly describe
the scene features that could have produced each junction in
the given line drawing. The simplest type of combination
rule merely states that a label is a possible description for
a junction if and only if there is at least one label which
"matches" it assigned to each adjacent junction. Two
junction labels "match" if and only if the line segment which
joins the junctions gets the same interpretation from both of
the junctions at its ends.

Of course, each interpretation (line label) is really a
shorthand code for a number of properties of the line and its
adjoining regions. If the program can show that any one of
these constituent values cannot occur in the given scene
context, then the whole complex of values for that line
expressed implicitly in the interpretation cannot be possible
either and, furthermore, any junction label which assigns
this interpretation to the line segment can be eliminated as
well. Thus, when it chooses a label to describe a
particular junction, it constrains all the junctions which
surround the regions touching this junction, even though the
combination rules only compare adjacent junctions.

More complicated rules are needed if it is necessary to
relate junctions which do not share a visible region or line
segment. For example, I thought at the outset of my work
that it might be necessary to construct models of hidden
vertices or features which faced away from the eye in order
to find unique labels for the visible features. The
difficulty in this is that unless a program can find which
lines represent obscuring edges, it cannot know where to
construct hidden features, but if it needs the hidden
features to label the lines, it may not be able to decide
which lines represent obscuring edges. As it turns out, no
such complicated rules and constructions are necessary in
general; most of the labeling problem can be solved by a
scheme which only compares adjacent junctions.



1.5 EXPERIMENTAL RESULTS

When I began to write a program to implement the system
I had devised, I expected to use a tree search system to find
which labels or "words" could be assigned to each junction.
However, the number of dictionary entries for each type of
junction is very high (there are almost 3000 different ways
to label a FORK junction before even considering the possible
region orientations!), so I decided to use a sort of
"filtering program" before doing a full tree search.

The program computes the full list of dictionary entries
for each junction in the scene, eliminates from the list
those labels which can be precluded on the basis of local
features, and assigns each reduced list to its junction. The
filtering program then computes the possible labels for each
line, using the fact that a line label is possible if and
only if there is at least one junction label at each end of
the line which contains the line label. Thus, the list of
possible labels for a line segment is the intersection of the
two lists of possibilities computed from the junction labels
at the ends of the line segment. If any junction label would
assign an interpretation to the line segment which is not in
this intersection list, then that label can be eliminated
from consideration. The filtering program uses a network
iteration scheme to systematically remove all the
interpretations which are precluded by the elimination of
labels at a particular junction.
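
As an illustration only, the filtering step described above can
be sketched in Python. This is not the MICRO-PLANNER and LISP
code of the thesis. The function name, the data layout (a tuple
of line labels per junction label) and the work-queue form of
the "network iteration" are all assumptions made for the
sketch. It also assumes that a line's label is written the same
way when read from either of its ends.

```python
from collections import deque


def filter_junction_labels(junction_lines, candidates):
    """Discard junction labels that no neighboring junction can agree with.

    junction_lines: {junction: (line_id, ...)} in each junction's line order.
    candidates:     {junction: set of label tuples}, one label per line,
                    in the same order as junction_lines[junction].
    """
    remaining = {j: set(labels) for j, labels in candidates.items()}

    # For every line, record which junctions it joins and at which position.
    ends = {}
    for junction, lines in junction_lines.items():
        for position, line in enumerate(lines):
            ends.setdefault(line, []).append((junction, position))

    # Only lines with a junction at both ends carry a constraint.
    queue = deque(line for line, e in ends.items() if len(e) == 2)
    queued = set(queue)

    while queue:
        line = queue.popleft()
        queued.discard(line)
        (ja, pa), (jb, pb) = ends[line]
        # A line label survives only if both end junctions still offer it.
        allowed = ({lab[pa] for lab in remaining[ja]}
                   & {lab[pb] for lab in remaining[jb]})
        for junction, position in ((ja, pa), (jb, pb)):
            kept = {lab for lab in remaining[junction]
                    if lab[position] in allowed}
            if kept != remaining[junction]:
                remaining[junction] = kept
                # A change here may invalidate labels on the other lines.
                for other in junction_lines[junction]:
                    if (other != line and len(ends[other]) == 2
                            and other not in queued):
                        queue.append(other)
                        queued.add(other)
    return remaining
```

Each pass over a line only ever removes junction labels, so the
loop must terminate. A junction left with several labels
corresponds to the residual ambiguity discussed in the text.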

When I ran this filtering program I was amazed to find 
that in the first few scenes I tried, this program found a
unique label for each line. Even when I tried considerably
more complicated scenes, there were only a few lines in
general which were not uniquely specified, and some of these
were essentially ambiguous, i.e. I could not decide exactly
what sort of edge gave rise to the line segment myself. The
other ambiguities, i.e. the ones which I could resolve
myself, in general require that the program recognize lines
which are parallel or collinear or regions which meet along
more than one line segment, and hence require more global 
agreement. 

I have been able to use this system to investigate a
large number of line drawings, including ones with missing
lines and ones with numerous accidentally aligned junctions.
From these investigations I can say with some certainty which
types of scene features can be handled by the filtering
program and which require more complicated processing.
Whether or not more processing is required, the filtering
system provides a computationally cheap method for acquiring
a great deal of information. For example, in most scenes a
large percentage of the line segments are unambiguously
labeled, and more complicated processing can be directed to
the areas which remain ambiguous. As another example, if I
only wish to know which lines are shadows or which lines are
the outside edges of objects or how many objects there are in
the scene, the program may be able to get this information
even though some ambiguities remain, since the ambiguity may
only involve region illumination type or region orientation.

Figure 1.4 shows some of the scenes which the program is
able to handle. The segments which remain ambiguous after
its operation are marked with stars, and the approximate
amount of time the program requires to label each scene is
marked below it. The computer is a PDP-10, and the program
is written partially in MICRO-PLANNER (Sussman et al 1971)
and partially in compiled LISP.

1.6 COMPARISON WITH OTHER VISION PROGRAMS

My system differs from previously proposed ones in
several important ways: 

First, it is able to handle a much broader range of 
scene types than have previous programs. The program 
"understands" shadows, some junctions which have missing 
lines, and apparent alignment of edges caused by the 
particular placement of the eye with respect to the scene, so 
that no special effort needs to be made to avoid problematic 
features. 

Second, the design of the program facilitates its
integration with line-finding programs and higher-level
programs such as programs which deal with natural language or
overall system goals. The system can be used to write a
program which automatically requests and uses many different
types of information to find the possible interpretations for
a single feature or portion of a scene. 

Third, the program is able to deal with ambiguity in a
natural manner. Some features in a scene can be ambiguous to
a person looking at the same scene and the program preserves
these various possibilities. This tolerance for ambiguity is
central to the philosophy of the program; rather than trying
to pick the "most probable" interpretation of any features,
the program operates by trying to eliminate impossible
interpretations. If it has been given insufficient
information to decide on a unique possibility, then it
preserves all the active possibilities it knows. Of course,
if a single interpretation is required for some reason, one
can be chosen from this list by heuristic rules.

Fourth, the program is algorithmic and does not require
facilities for back-up if the filter program finds an
adequate description. Heuristics have been used in all
previous vision programs to approximate reality by the most
likely interpretation. This may simplify some problems, but
sophisticated programs are needed to patch up the cases where
the approximation is wrong; in my program I have used as
complete a description as I could devise, with the result that
the programs are particularly simple, transparent and
powerful.

Fifth, because of this simplicity, I have been able to
write a program which operates very rapidly. As a practical
matter this is very useful for debugging the system, and
allows modifications to be made with relative ease.
Moreover, because of its speed, I have been able to test the
program on many separate line drawings and have thus been 
able to gain a clearer understanding of the capabilities and 
ultimate limitations of the program. In turn, this 
understanding has led and should continue to lead to useful 
modifications and a greater understanding of the nature and 
complexity of procedures necessary to handle various types of 
scene features. 

Sixth, as explained in the next section, the descriptive 
language provides a theoretical foundation of considerable 
value in explaining previous work.

1.7 HISTORICAL PERSPECTIVE 

One of the great values of the extensive descriptive
apparatus I have developed is its ability to explain the
nature and shortcomings of past work. I will discuss in
Chapter 9 how my system helps in understanding the work of
Guzman (Guzman 1968), Rattner (Rattner 1970), Huffman
(Huffman 1971), Clowes (Clowes 1971), and Orban (Orban 1970),
and in explaining portions of the work of Winston (Winston
1970) and Finin (Finin 1971a, 1971b). For example, I show how
various concepts such as support can be formalized in my
descriptive language. From this historical comparison
emerges a striking demonstration of the ability of good
descriptions to both broaden the range of applicability of a
program and simplify the program structure.

1.8 IMPLICATIONS FOR HUMAN PERCEPTION 

My belief that the rules which govern the interpretation
of a line drawing should be simple is based on the subjective
impression that little abstraction or processing of any type
seems to be required for me to be able to recognize the
shadows, object edges, etc. in such a drawing, in cases where
the drawing is reasonably simple and complete. I do not
believe that human perceptual processes necessarily resemble
the processes in my program, but there are various aspects of
my solution which appeal to my intuition about the nature of
that portion of the problem which is independent of the type
of perceiver. I think it is significant that my program is
as simple as it is, and that the information stored in it is
so independent of particular objects. Back-up is not
necessary in general; the system works for picture fragments
as well as for entire scenes; the processing time required
is proportional to the number of line segments and not an
exponential function of the number; all these facts lead me
to believe that my research has been in the right direction.

Clearly there are considerable obstacles to be overcome
in extending this work to general scenes. For simple curved
objects such as cylinders, spheres, cones, and conic
sections, there should be no particular problem in using the
type of program I have written. (For a quite different
approach to the handling of curved objects, see Horn 1970.)
I also believe that it will be possible to handle somewhat
more general scenes (for instance scenes containing
furniture, tools and household articles) by approximating the
objects in them by simplified "envelopes" which preserve the
gross form of the objects yet which can be described in terms
like those I have used. In my estimation such processing
cannot be done successfully until the problem of
reconstructing the invisible portions of the scene is solved.
This problem is intimately connected with the problem of
using the stored description of an object to guide the search
for instances of this object, or similar objects, in a scene.
The ability to label a line drawing in the manner I describe
greatly simplifies the specification and hopefully will
simplify the solution of these problems. Chapter 8 deals
with natural extensions of my program which I believe will
lead toward the eventual solution of these problems.

2.0 QUICK SYNOPSIS

This chapter provides a quick look at some of the 
technical aspects of my work. All topics covered here are 
treated either in greater detail or from a different
perspective in later chapters. For a hurried reader this
chapter provides a map to the rest of the paper, and enough 
background to understand a later chapter without reading all 
the intervening ones. 

2.1 THE PROBLEM 

In what follows I frequently make a distinction between 
the scene itself (objects, table, and shadows) and the
retinal representation of the scene as a two-dimensional line
drawing. I will use the terms vertex, edge and surface to
refer to the scene features which map into junction, line and
region respectively in the line drawing.

Our first subproblem is to develop a language that 
allows us to relate these two worlds. I have done this by 
assigning names called labels to lines in the line drawing,
after the manner of Huffman (Huffman 1971) and Clowes (Clowes
1971). Thus, for example, in figure 2.1 line segment J1-J2
is labeled as a shadow edge, line J2-J3 is labeled as a
concave edge, line J3-J4 is labeled as a convex edge, line
J4-J5 is labeled as an obscuring edge and line J12-J13 is
labeled as a crack edge. Thus, these terms are attached to
parts of the drawing, but they designate the kinds of things
found in the three-dimensional scene.

When we look at a line drawing of this sort, we usually 
can easily understand what the line drawing represents. In 
terms of a labeling scheme either (1) we are able to assign 
labels uniquely to each line, or (2) we can say that no such 
scene could exist, or (3) we can say that although it is
impossible to decide unambiguously what the label of an edge
should be, it must be labeled with one member of some
specified subset of the total number of labels. What
knowledge is needed to enable the program to reproduce such
labeling assignments? 

Huffman and Clowes provided a partial answer in their
papers. They pointed out that each type of junction can only
be labeled in a few ways, and that if we can say with
certainty what the label of one particular line is, we can
greatly constrain all other lines which intersect that line
segment at its ends. As a specific example, if one branch of
an L junction is labeled as a shadow edge, then the other
branch must be labeled as a shadow edge as well.

Moreover, shadows are directional, i.e. in order to 
specify a shadow edge, it must not only be labeled "shadow" 
but must also be marked to indicate which side of the edge is
shadowed and which side is illuminated. Therefore, not only
the type of edge but the nature of the regions on each side 
can be constrained. 

These facts can be illustrated in a jigsaw puzzle
analogy, shown in figure 2.2. Given the five different edge
types I have discussed so far, there are seven different ways
to label any line segment. This implies that if all line
labels could be assigned independently there would be 7^2 = 49
different ways to label an L, 7^3 = 343 ways to label a
three-line junction, etc. In fact there are only 9 ways in
which real scene features can map into Ls on a retinal
projection. Table 2.1 summarizes the ways in which junctions
can be assigned labelings from this set. In figure 2.3, I
show all the possible labelings for each junction type,
limiting myself to vertices which are formed by no more than
three planes (trihedral vertices) and to junctions of five or
fewer lines. In Chapter 3 I explain how to obtain the
junctions in figure 2.3; I do not expect that it should be
obvious to you how one could obtain these junctions. In
general, for clarity, I have tried to use the word labeling
to refer to the simultaneous assignment of a number of line
labels. Labels thus refer to line interpretations, and
labelings refer to junction or scene interpretations.
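
The arithmetic above can be checked with a few lines of Python.
The seven label names are assumptions chosen for illustration:
the text gives five edge types and seven labels, and I take the
two extra labels to be the second directions of the shadow and
obscuring edges.

```python
from itertools import product

# Assumed names: five edge types, two of which come in two directions.
LINE_LABELS = (
    "convex", "concave", "crack",
    "obscuring/left", "obscuring/right",
    "shadow/left", "shadow/right",
)


def unconstrained_labelings(n_lines):
    """Count label tuples when every line is labeled independently."""
    return sum(1 for _ in product(LINE_LABELS, repeat=n_lines))


print(unconstrained_labelings(2))  # L junction: 49
print(unconstrained_labelings(3))  # three-line junction: 343
```

Against these counts, the 9 physically realizable L labelings
quoted in the text are less than a fifth of the 49 independent
combinations.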

2.2 SOLVING THE LABEL ASSIGNMENT PROBLEM 

Labels can be assigned to each line segment by a tree 
search procedure. In terms of the jigsaw puzzle analogy, 
imagine that we have the following items:

1. A board with channels cut to represent the line
drawing; the board space can accept only L pieces at each
place where the line drawing has an L, only ARROW pieces
where the line drawing has an ARROW, etc. Next to each
junction are three bins, marked "junction number", "untried
labels", and "tried labels".

2. A full set of pieces for every space on the board. If
the line drawing represented by the board has five Ls then
there are five full sets of L pieces with nine pieces in each
set.

3. A set of junction number tags marked J1, J2, J3,
..., Jn, where n is the number of junctions on the board.

4. A counter which can be set to any number between 1
and n.

The tree search procedure can then be visualized as
follows:

Step 1: Name each junction by placing a junction number tag
in each bin marked "junction number".

Step 2: Place a full set of the appropriate type of pieces in
the "untried labels" bin of each junction.

Step 3: Set the counter to 1. From here on Nc will be
used to refer to the current value of the counter. Thus if
the counter is set to 6, then J(Nc) = J6.

Step 4: Try to place the top piece from the "untried labels"
bin of junction J(Nc) in board space J(Nc). There are
several possible outcomes:

    A. If the piece can be placed (i.e. the piece matches
    all adjacent pieces already placed, if any), then

        A1. If Nc < n, increase the counter by one and
        repeat Step 4.

        A2. If Nc = n, then the pieces now on the board
        represent one possible labeling for the line drawing.
        If this is the case, then

            i. Write down or otherwise remember the
            labeling, and

            ii. Transfer the piece in space n back into the
            n-th "untried labels" bin, and

            iii. Go to Step 5.

    B. If the piece cannot be placed, put it in the "tried
    labels" bin and repeat Step 4.

    C. If there are no more pieces in the "untried labels"
    bin, then

        C1. If Nc = 1, we have found all (if any) possible
        labelings, and the procedure is DONE.

        C2. Otherwise, go to Step 5.

Step 5: Do all the following steps:

    i. Transfer all the pieces from the Nc-th
    "tried labels" bin into the Nc-th "untried labels" bin, and

    ii. Transfer the piece in space Nc-1 into its
    "tried labels" bin, and

    iii. Set the counter to Nc-1, and go to Step 4.
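
The same search can be sketched compactly in Python. This is a
hedged illustration, not the program of the thesis: recursion
stands in for the counter and the two bins, and the function
name and data layout are assumptions. As in the filtering
sketch, a line's label is assumed to read the same from both
of its ends.

```python
def enumerate_labelings(junction_lines, dictionary):
    """Return every assignment of one piece per junction that fits everywhere.

    junction_lines: {junction: (line_id, ...)}
    dictionary:     {junction: [label tuple, ...]} in the order pieces are tried.
    """
    order = list(junction_lines)   # J1, J2, ..., Jn
    line_label = {}                # labels fixed by pieces already on the board
    placed = {}                    # junction -> piece currently on the board
    found = []

    def fits(junction, piece):
        # A piece fits if it agrees with every already-labeled line it touches.
        return all(line_label.get(line, label) == label
                   for line, label in zip(junction_lines[junction], piece))

    def search(k):
        if k == len(order):        # every space is filled: one labeling
            found.append(dict(placed))
            return
        junction = order[k]
        for piece in dictionary[junction]:
            if not fits(junction, piece):
                continue           # this piece goes to the "tried labels" bin
            fresh = {line for line in junction_lines[junction]
                     if line not in line_label}
            line_label.update(zip(junction_lines[junction], piece))
            placed[junction] = piece
            search(k + 1)
            del placed[junction]   # back up, as in Step 5
            for line in fresh:
                del line_label[line]

    search(0)
    return found
```

In the worst case the number of placements tried grows with the
product of the dictionary sizes, which is why the order of the
pieces and any prior pruning of the dictionaries matter so much.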

To see how this procedure works in practice, see figure
2.4. For this example assume that the pieces are piled so
that the order in which they are tried is the same as the
order in which the pieces are listed in figure 2.3. The
example is carried out only as far as the first labeling
obtained by the procedure. There is, of course, at least one
other labeling, namely the one we could assign by inspection.
The "false" labeling found first could be eliminated in this
case by a program if it knew that R3 is brighter than R1 or
that R2 is brighter than R1. It could then use heuristics
which only allow it to fit a shadow edge in one orientation,
given the relative illumination on both sides of a line.
However, if the object happened to have a darker surface than
the table, this heuristic would not help.

Clearly this procedure leaves many unsolved problems.
In general there will be a number of possible labelings from
which a program must still choose one. What rules can it use
to make the choice? Even after choosing a labeling, in order
to answer questions (about the number of objects in the
scene, about which edges are shadows, about whether or not
any objects support other objects, etc.) a program must use
rules of some sort to deduce the answers from the information
it has.

I will argue that what is needed to find a single
reasonable interpretation of a line drawing is not a more
clever set of rules or theorems to relate various features of 
the line drawing, but merely a better description of the 
scene features. In fact, it turns out that we can use a 
parsing procedure which involves less computation than the 
tree search procedure. 

2.3 BETTER EDGE DESCRIPTION 

So far I have classified edges only on the basis of 
geometry (concave, convex, obscuring or planar) and have 
subdivided the planar class into crack and shadow
sub-classes. Suppose that I further break down each class
according to whether or not each edge can be the bounding
edge of an object. Objects can be bounded by obscuring
edges, concave edges, and crack edges. Figure 2.5 shows the
results of appending a label analogous to the "obscuring
edge" mark to crack and concave edges. This approach is
similar to one first proposed by Freuder (Freuder 1971a).

Each region can also be labeled as belonging to one of 
the three following classes: 

I - Illuminated directly by the light source. 

SP - A projected shadow region; such a region would be
illuminated if no object were between it and the light
source.

SS - A self-shadowed region; such a region is oriented
away from the light source.

Given these classes, I can define new edge labels which
also include information about the lighting on both sides of
the edge. Notice that in this way I can include at the edge
level, a very local level, information which constrains all
edges bounding the same two regions. Put another way,
whenever a line can be assigned a single label which includes
this lighting information, then a program has powerful
constraints for the junctions which can appear around either
of the regions which bound this line.

Figure 2.6 is made up of tables which relate the region
illumination types which can occur on both sides of each edge
type. For example, if either side of a concave or crack edge
is illuminated, both sides of the edge must be illuminated.

These tables can be used to expand the set of allowable 
junction labels; the new set of labels can have a number of 
entries which have the same edge geometries but which have 
different region illumination values. It is very easy to
write a program to expand the set of labelings; the
principles of its operation are (1) each region in a given
junction labeling can have only one illumination value of the
three, and (2) the values on either side of each line of the
junction must satisfy the restrictions in the tables of
figure 2.6.
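
The two principles just stated can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not
the program described in this report: the names ILLUMINATION,
ALLOWED and expand are hypothetical, the regions of a junction are
assumed to lie between consecutive lines (region i between line i
and line i+1, cyclically), and the table below is a placeholder
rather than figure 2.6. The only restriction taken from the text is
the one for concave and crack edges; every other edge type is left
unrestricted here.

```python
from itertools import product

ILLUMINATION = ("I", "SP", "SS")
ALL_PAIRS = set(product(ILLUMINATION, repeat=2))


def lit_together(pairs):
    """Keep only pairs whose two sides are both illuminated or both not."""
    return {(a, b) for a, b in pairs if (a == "I") == (b == "I")}


# Placeholder table (an assumption, NOT the contents of figure 2.6).
# Only the concave/crack rule is taken from the text.
ALLOWED = {
    "concave": lit_together(ALL_PAIRS),
    "crack": lit_together(ALL_PAIRS),
    "convex": ALL_PAIRS,      # placeholder: unrestricted
    "obscuring": ALL_PAIRS,   # placeholder: unrestricted
    "shadow": ALL_PAIRS,      # placeholder: unrestricted
}


def expand(edge_types):
    """Return every illumination assignment for the regions of a junction.

    edge_types lists the edge type of each line in order around the
    junction. Line i separates region i-1 from region i (cyclically).
    Principle (1): each region receives exactly one value.
    Principle (2): the pair of values across each line must be allowed.
    """
    k = len(edge_types)
    results = []
    for regions in product(ILLUMINATION, repeat=k):
        if all((regions[i - 1], regions[i]) in ALLOWED[edge_types[i]]
               for i in range(k)):
            results.append(regions)
    return results


if __name__ == "__main__":
    for assignment in expand(("concave", "convex", "convex")):
        print(assignment)
```

With the real tables of figure 2.6 substituted for the placeholder
entries, the same loop would produce the expanded label set for each
old junction label.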

An interesting result of this further subdivision of the
line labels is that, with four exceptions, each
shadow-causing junction has only one possible illumination
parsing, as shown in figure 2.7. Thus whenever a scene has
shadows and whenever a program can find a shadow-causing
junction in such a scene, it can greatly constrain all the
lines and regions which make up this junction. In figure 2.7
I have also marked each shadow edge which is part of a
shadow-causing junction with an "L" if the arrow on the
shadow edge points counterclockwise and an "R" if the arrow
points clockwise. No "L" shadow edge can match an "R" shadow
edge, corresponding to the physical fact that it is
impossible for a shadow edge to be caused from both of its
ends.

There are two extreme possibilities for the effect that this
partitioning may have on the number of junction labelings now
needed to describe all real vertices:

(1) Each old junction label which has n concave edges, m
crack edges, p clockwise shadow edges, q counterclockwise
shadow edges, s obscuring edges and t convex edges will have
to be replaced by (20)^n (6)^m (3)^p (3)^q (9)^s (9)^t new
junctions, or

(2) Each old junction will give rise to only one new
junction (as in the shadow-causing junction cases).

If (1) were true then the partition would be worthless,
since no new information could be gained. If (2) were true,
the situation would be greatly improved, since in a sense all
the much more precise information was implicitly included in
the original junctions but was not explicitly stated.
Because the information is now more explicitly stated, many
matches between junctions can be precluded; for example, if
in the old scheme some line segment L1 of junction label Q1
could have been labeled concave, as could line segment L2 of
junction label Q2, a line joining these two junctions could
have been labeled concave. But in the new scheme, if each
junction label gives rise to a single new label, both L1 and
L2 would take on one of the twenty possible values for a
concave edge. Unless both L1 and L2 gave rise to the same
new label, the line segment could not be labeled concave
using Q1 and Q2. The truth lies somewhere between the two
extremes, but the fact that it is not at the extreme of (1)
means that there is a net improvement. In Table 2.2 I
compare the situation now to cases (1) and (2) above and also
to the situation depicted in Table 2.1.

I have also used the better descriptions to express the
restriction that each scene is assumed to be on a horizontal
table which has no holes in it and which is large enough to
fill the retina. This means that any line segment which
separates the background (table) from the rest of the scene
can only be labeled as shown in figure 2.8. Because of this
fact the number of junction labels which could be used to
label junctions on the scene/background boundary can be
greatly restricted.

The value of a better description should be immediately
apparent. In the old classification scheme three out of the
seven line labels could appear on the scene/background
boundary, whereas in the new classification, only seven out
of fifty labels can occur. Moreover, since each junction
must have two of its line segments bounding any region, the
fraction of junctions which can be on the scene/background
boundary has improved roughly from (3/7)(3/7) = 9/49, or
about 18.4%, to (7/57)(7/57) = 49/3249, or about 1.5%. The
results of these improvements will become obvious in the
next section.
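
The rough estimate above can be checked with exact fractions. The
sketch below is only an arithmetic check; the function name
boundary_fraction is hypothetical, and the estimate itself rests on
the text's simplification that the two boundary lines of a junction
are labeled independently.

```python
from fractions import Fraction


def boundary_fraction(boundary_labels, total_labels):
    """Fraction of junctions whose two boundary lines both carry
    a label that is allowed on the scene/background boundary."""
    return Fraction(boundary_labels, total_labels) ** 2


old = boundary_fraction(3, 7)    # old scheme: 3 of 7 line labels
new = boundary_fraction(7, 57)   # new scheme, as in the ratio 7/57

print(old, round(float(old) * 100, 1))
print(new, round(float(new) * 100, 1))
```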

2.4 PROGRAMMING CONSEQUENCES

There are so many possible labels for each type of 
junction that I decided to begin programming a labeling 
system by writing a sort of filtering program to eliminate as 
many junction labels as possible before beginning a tree 
search procedure.

The filter procedure depends on the following
observation, given in terms of the jigsaw puzzle analogy:

Suppose that we have two junctions, J1 and J2, which are
joined by a line segment L-J1-J2. J1 and J2 are
represented by adjacent spaces on the board and the
possible labels for each junction by two stacks of
pieces. Now for any piece M in J1's stack either (1)
there is a matching piece N in J2's stack or (2) there
is no such piece. If there is no matching piece for M
then M can be thrown away and need never be considered
again as a possible junction label.

The filter procedure below is a method for
systematically eliminating all junction labels for which
there can never be a match. All the equipment is the same as
that used in the tree search example, except that this time I
have added a card marked "junction modified" on one side and
"no junction modified" on the other.



Step 1: Put a junction number tag between 1 and n in
each "junction number" bin. Place a full set of pieces
in the "untried labels" bin of each junction.

Step 2: Set the counter to Nc = 1, and place the card so
that it reads "no junction modified".

Step 3: Check the value of Nc:

A. If Nc = n + 1 and the card reads "no junction
modified" then go to SUCCEED.

B. If Nc = n + 1 and the card reads "junction
modified" then go to Step 2. (At least one piece was
thrown away on the last pass, and therefore it is
possible that other pieces which were kept only because
this piece was present will now have to be thrown away
also.)

C. Otherwise, go to Step 4.

Step 4: Check the "untried labels" bin of junction
J(Nc):

A. If there are no pieces left in the Nc-th
"untried labels" bin, then

A1. If there are no pieces in the Nc-th
"tried labels" bin, go to FAILURE.

A2. Otherwise, transfer the pieces from the
Nc-th "tried labels" bin back into the Nc-th "untried
labels" bin, add 1 to the counter (Nc) and go to Step 3.

B. If there are pieces left in the Nc-th "untried
labels" bin, take the top piece from the bin and place
it on the board, and go to Step 5.

Step 5: Check the spaces adjacent to space Nc:

A. If the piece in the Nc-th space has matching
pieces in each neighboring junction space, transfer the
piece from space Nc into the Nc-th "tried labels" bin,
and transfer the pieces from the neighboring spaces and
neighboring "tried labels" bins back into their
"untried labels" bins, and go to Step 4.

B. If there are empty neighboring spaces, then

B1. If there are no more pieces in the
neighboring "untried labels" bins which could fit with
the piece in space Nc, then that piece is not a possible
label. Throw it away, and arrange the card to read
"junction modified" if it doesn't already.

B2. Try pieces from the neighboring "untried
labels" bins until either a piece fits or the pile is
exhausted, and then go to Step 5 again.

SUCCEED: The pieces in the "untried labels" bins of each
junction have passed the filtering routine and
constitute the output of this procedure.

FAILURE: There is no way to label the scene given the
current set of pieces.
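
Stripped of the bins and the card, the procedure above repeats whole
passes over the junctions until a pass throws nothing away. The
Python sketch below illustrates that idea only; it is not the
program described in this report. The data layout is an assumption:
each junction label is a dictionary from a line name to the edge
label it assigns to that line, two labels "match" when they assign
the same edge label to the shared line, and the names filter_labels
and supported are hypothetical.

```python
def supported(junction, label, remaining, lines):
    """True if, on every line at this junction, the junction at the
    other end still has some label agreeing with `label` on that line."""
    for line, (a, b) in lines.items():
        if junction not in (a, b):
            continue
        other = b if junction == a else a
        if not any(n[line] == label[line] for n in remaining[other]):
            return False
    return True


def filter_labels(candidates, lines):
    """Repeat passes until no label is thrown away.

    candidates: dict junction -> list of labels (dict line -> edge label)
    lines:      dict line name -> (junction_a, junction_b)
    Returns the surviving labels, or None if some junction has none left.
    """
    remaining = {j: list(labels) for j, labels in candidates.items()}
    modified = True                     # plays the role of the card
    while modified:
        modified = False                # "no junction modified"
        for junction, labels in remaining.items():
            kept = [lab for lab in labels
                    if supported(junction, lab, remaining, lines)]
            if not kept:
                return None             # FAILURE
            if len(kept) < len(labels):
                modified = True         # "junction modified"
                remaining[junction] = kept
    return remaining                    # SUCCEED


if __name__ == "__main__":
    lines = {"L-J1-J2": ("J1", "J2")}
    candidates = {
        "J1": [{"L-J1-J2": "concave"}, {"L-J1-J2": "convex"}],
        "J2": [{"L-J1-J2": "convex"}, {"L-J1-J2": "shadow"}],
    }
    print(filter_labels(candidates, lines))
```

In this small example only the convex piece survives in each stack,
since it is the only piece with a matching partner across L-J1-J2.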



In the program I wrote, I used a somewhat more complex
variation of this procedure which only requires one pass
through the junctions. This procedure is similar to the one
used to generate figure 2.9, and is described below.

When I ran the filter program on some simple line 
drawings, I found to my amazement that the filter procedure 
yielded unique labels for each junction in most cases! In 
fact in every case I have tried, the results of this 
filtering program are the same results which would be 
obtained by running a tree search procedure, saving all the 
labelings produced, and combining all the resulting
possibilities for each junction. In other words, the filter 
program in general eliminates all labels except those which 
are part of some tree search labeling for the entire scene. 

It is not obvious that this should be the case. For
example, if this filter procedure is applied to the simple
line drawing shown in figure 1.4 using the old set of labels
given in figure 2.3, it produces the results shown in figure
2.9. In this figure, each junction has labels attached which
would not be part of any total labeling produced by a tree
search. This figure is obtained by going through the
junctions in numerical order and:

(1) Attaching to a junction all labels which do not
conflict with junctions previously assigned; i.e. if it is
known that a branch must be labeled from the set S, do not
attach any junction labels which would require that the
branch be labeled with an element not in S.

(2) Looking at the neighbors of this junction which have
already been labeled; if any label does not have a
corresponding assignment for the same branch, then eliminate
it.

(3) Whenever any label is deleted from a junction, look
at all its neighbors in turn, and see if any of their labels
can be eliminated. If they can, continue this process
iteratively until no more changes can be made. Then go on to
the next junction (numerically). The junction which was
being labeled (as in step (1)) at the time a label was
eliminated (struck out in the figure) is noted next to each
eliminated label in figure 2.9.

The fact that these results can be produced by the
filtering program says a great deal about line drawings
generated by real scenes and also about the value of precise
descriptions. There is sufficient local information in a
line drawing so that a program can use a procedure which
requires far less computation than does a tree search
procedure. To see why this is so, notice that if the
description the program uses is good enough, then many
junctions must always be given the same unique label in each
tree search solution; the filtering program needs to find
such a label only once, while a tree search procedure must go
through the process of finding the same solution on each pass
through the tree.
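
The three numbered steps used to obtain figure 2.9 visit each
junction once and push the consequences of every deletion outward
immediately. The Python sketch below illustrates that one-pass
scheme under assumptions of my own choosing: labels are dictionaries
from line name to edge label, two labels agree when they give a
shared line the same edge label, and the names one_pass_filter and
supported are hypothetical. It is not the program described in this
report.

```python
from collections import deque


def one_pass_filter(order, full_sets, lines):
    """Visit junctions in `order`, propagating deletions as they occur.

    order:     junction names in the order they are labeled
    full_sets: dict junction -> list of labels (dict line -> edge label)
    lines:     dict line name -> (junction_a, junction_b)
    """
    adjacent = {j: [] for j in order}
    for line, (a, b) in lines.items():
        adjacent[a].append((line, b))
        adjacent[b].append((line, a))

    assigned = {}

    def supported(j, label):
        # Only neighbors that have already been labeled constrain j.
        return all(any(other[line] == label[line] for other in assigned[n])
                   for line, n in adjacent[j] if n in assigned)

    for j in order:
        # Step (1): attach only labels that agree with earlier junctions.
        assigned[j] = [lab for lab in full_sets[j] if supported(j, lab)]
        # Steps (2) and (3): recheck labeled neighbors; whenever a label
        # is deleted, recheck that junction's labeled neighbors in turn.
        queue = deque(n for _, n in adjacent[j] if n in assigned)
        while queue:
            k = queue.popleft()
            kept = [lab for lab in assigned[k] if supported(k, lab)]
            if len(kept) < len(assigned[k]):
                assigned[k] = kept
                queue.extend(n for _, n in adjacent[k] if n in assigned)
    return assigned


if __name__ == "__main__":
    lines = {"ab": ("A", "B"), "bc": ("B", "C")}
    full_sets = {
        "A": [{"ab": "+"}, {"ab": "-"}],
        "B": [{"ab": "+", "bc": "+"}, {"ab": "-", "bc": "-"}],
        "C": [{"bc": "-"}],
    }
    print(one_pass_filter(["A", "B", "C"], full_sets, lines))
```

In the example, labeling C removes one label from B, and that
deletion in turn removes one label from A, all within the single
visit to C.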

Quite remarkably, all these results are obtained using 
only the topology of line drawings plus knowledge about which 
region is the table and about the relative brightness of each 
region. No use is made (yet) of the direction of line
segments (except that some directional information is used to
classify the junctions as ARROWs, FORKs, etc.), nor is any
use made of the length of line segments, microstructure of
edges, lighting direction or other potentially useful cues.



2.5 HANDLING BAD DATA 

So far I have treated this subject as though the program 
would always be given perfect data. In fact there are many 
types of errors and degeneracies which occur frequently. 
Some of these can be corrected through use of better line 
finding programs and some can be eliminated by using stereo 
information, but I would like to show that the program can
handle various problems by simple extensions of the list of 
junction labels. In no case do I expect the program to be 
able to sort out scenes that people cannot easily understand. 

Two of the most common types of bad data are (1) edges 
missed entirely due to equal region brightness on both sides 
of the edge, and (2) accidental alignment of vertices and 
lines. Figure 2.10 shows a scene containing instances of 
each type of problem. 

The program handles these problem junctions by 
generating labels for them, just as it does for normal 
junctions. It is important to be able to do this, since it 
is in general very difficult to identify the particular 
junction which causes the program to fail to find a parsing 
of the scene. Even worse, the program may find a way of 
interpreting the scene as though the data were perfect and it
would then not even get an indication that it should look for
other interpretations.

2.6 ACCIDENTAL ALIGNMENT 

Chapter 7 treats a number of different types of
accidental alignment. Figure 2.11 shows three of the most
common types which are included in the program's repertoire;
consider three kinds of accidental alignment:

(1) cases where a vertex apparently has an extra line
because an edge obscured by the vertex appears to be part of
the vertex (see figure 2.11a),

(2) cases where an edge which is between the eye and a
vertex appears to intersect the vertex (see figure 2.11b),
and

(3) cases where a shadow is projected so that it
actually does intersect a vertex (see figure 2.11c).

2.7 MISSING LINES

I have not attempted to systematically include all
missing line possibilities, but have only included labels for
the most common types of missing lines. I require that any
missing line be in the interior of the scene; no line on the
scene/background boundary can be missing. I also assume that
all objects have approximately the same reflectivity on all
surfaces. Therefore, if a convex line is missing, I assume
that either both sides of the edge were illuminated or that
both were shadowed. I have not really treated missing lines
in a complete enough way to say much about them. There will
have to be facilities in the program for filling in hidden
surfaces and back faces of objects before missing lines can
be treated satisfactorily.

In general the program will report that it is unable to
label a scene if more than a few lines are missing and the
missing line labels are not included in the set of possible
junction labels. This is really a sign of the power of the
program, since if the appropriate labels for the missing line
junctions were included, the program would find them
uniquely. As an example, the simple scene in figure 2.12
cannot be labeled at all unless the missing line junctions
are included.

2.8 REGION ORIENTATIONS 

Regions can be assigned labels which give quantized
values for region orientations in three dimensions. These
labels can be added to the junction labels in very much the
same way that the region illumination values were added. It
is impossible to do justice to the topic here, but region
orientations are treated in considerable detail in Chapter 8.

3.0 TRIHEDRAL JUNCTION LABELS

The knowledge of this system is expressed in several
distinct forms:

(1) A list of possible junction labels for each type of
junction geometry includes the a priori knowledge about the
possible three dimensional interpretations of a junction.

(2) Selection rules which use junction geometry,
knowledge about which region is the table, and region
brightness. These can easily be extended to use line segment
directions to find the subset of the total list of possible
junction labelings which could apply at a particular junction
in a line drawing.

(3) A program to find the possible labelings; it knows
how to systematically eliminate impossible combinations of
labels in a line drawing and, as such, contains implicit
knowledge about topology.

(4) Optional heuristics which can be invoked to select
a single labeling from among those which remain after all the
other knowledge in the program has been used. These
heuristics find a "plausible" interpretation if required.
For example, one heuristic eliminates interpretations that
involve concave objects in favor of ones that involve convex
objects, and another prefers interpretations which have the
smallest number of objects; this heuristic prefers a shadow
interpretation for an ambiguous region to the interpretation
of the region as a piece of an object.

In this chapter I show how to express the first type of
knowledge, and give hints about some of the others. A large
proportion of my energy and thought has gone into the choice
of the set of possible line labels and the sets of possible
junction labels. In this I have been guided by experiment
with my program, since there are simply too many labels to
hand simulate the program's reaction to a scene. The
program, the set of edge labels, and the sets of junction
labelings have each gone through an evolution involving
several steps. At each step I noted the ambiguities of
interpretation which remained, and then modified the system
appropriately.

The changes have generally involved (1) the subdivision
of one or more edge labels into several new labels embodying
finer distinctions, and (2) the recomputation of the junction
label lists to include these new distinctions. In each case
I have been able to test the new scheme to make sure that it
solves the old problems without creating any unexpected new
ones. For example, the initial data base contained only
junctions which (1) represented trihedral vertices (i.e.
vertices caused by the intersection of exactly three planes
at a point in space) and (2) which could be constructed using
only convex objects. The present data base has been expanded
to include all trihedral junctions and a number of other
junctions caused by vertices where more than three planes
meet.

Throughout this evolutionary process I have tried to
systematically include in the lists every possibility under
the stated assumptions. In this part of the system I have
made only one type of judgement: if a junction can represent
a vertex which is physically possible, include that junction
in the data base.

3.1 EDGE GEOMETRY

The first problem is to find all possible trihedral
vertices. Huffman observed that three intersecting planes,
whether mutually orthogonal or not, divide space into eight
parts so that the types of trihedral vertex can be
characterized by the octants of space around the vertex which
are filled by solid material (Huffman 1971).

Dowson (Dowson 1971a) went a little further in
discussing how one could write an algorithm to find all
possible trihedral junctions and their labels (using the
simple three-label model of Huffman and Clowes). In fact he
never used his system to generate every class of junction
geometry but was satisfied to show that it could generate the
twelve labels which Huffman and Clowes originally used.
These twelve labels represent four different ways of filling
in the octants (where I have not counted ways of filling the
octants which differ only by rotation as different).

Dowson's scheme is useful for visualizing how to
generate the ten different ways of filling the octants which
I use. Consider the general intersection of three planes as
shown in figure 3.1. These planes divide space into octants,
which can be uniquely identified by three-dimensional binary
vectors (x y z) where the x, y, and z directions are
specified as shown. The vectors make it easy to describe the
various geometries precisely. I can then generate all
possible geometries and non-degenerate views by imagining
various octants to be filled in with solid material. There
are junctions which correspond to having 1, 2, 3, 4, 5, 6, or
7 octants filled. Figure 3.2 shows the twenty possible
geometries that result from filling various octants, and in
Appendix 1 I have shown all the junction labelings (not
including shadow variations) which can result from the
geometries in figure 3.2A. The result of this process is 196
different junction labels. Figure 3.2B consists of the
geometries which I have chosen not to use to generate
junction labels. I have not included these geometries because
each involves objects which touch only along one edge, and
whose faces are nonetheless aligned, an extremely unlikely
arrangement when compared to the other geometries. (In
addition, some of the geometries are physically impossible
unless one or more objects are cemented together along an
edge or supported by invisible means.)

The four geometries recognized by Huffman, Clowes, and
Dowson correspond to my numbers 1, 3, 5, and 7 in figure
3.2A.
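
The binary-vector description of octants lends itself to a short
enumeration. The Python sketch below is an illustration under
assumptions that are mine and not taken from the text: octants are
the eight triples (x y z) with entries 0 or 1, and two fillings are
treated as the same geometry when one can be carried into the other
by permuting the three axes and reflecting through any of the three
planes. The function names are hypothetical. Under these assumed
symmetries the sketch finds twenty classes for 1 to 7 filled
octants; whether they coincide one for one with the twenty
geometries of figure 3.2 is not checked here.

```python
from itertools import combinations, permutations, product

# The eight octants as binary vectors (x, y, z).
OCTANTS = list(product((0, 1), repeat=3))

# Assumed symmetries: any axis permutation combined with any choice
# of reflections through the three planes (48 in all).
SYMMETRIES = [(perm, flips)
              for perm in permutations(range(3))
              for flips in product((0, 1), repeat=3)]


def transform(octant, perm, flips):
    """Permute the coordinates of an octant, then reflect selected axes."""
    return tuple(octant[perm[i]] ^ flips[i] for i in range(3))


def canonical(filling):
    """Smallest image of a set of filled octants under all symmetries."""
    return min(tuple(sorted(transform(o, perm, flips) for o in filling))
               for perm, flips in SYMMETRIES)


def distinct_fillings(k):
    """All ways of filling exactly k octants, up to the assumed symmetries."""
    return {canonical(c) for c in combinations(OCTANTS, k)}


if __name__ == "__main__":
    counts = [len(distinct_fillings(k)) for k in range(1, 8)]
    print(counts, sum(counts))
```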

In figure 3.3 I show how the 20 different labels with
type 3 geometry can be generated. Basically this process
involves taking a geometry from figure 3.2A, finding all the
ways that the solid segments can be connected or separated,
and finding all the possible views for each partitioning of
the quadrants. To generate all the possible views one can
either draw or imagine the particular geometry as it appears
when viewed from each octant. From some viewing octants the
central vertex is blocked from view by solid material, and
therefore not every viewing position adds new labelings.
Appendix 1 is obtained by applying this process to each of
the geometries in figure 3.2A.

Whenever one of the regions at a junction could
correspond to the background (i.e. the region is not part of
one of the three planes which intersect at the vertex) I have
marked the region with a star (*) both in figure 3.3 and
Appendix 1. Later I will show how to use this information to
aid the selection rules. Only 37 out of the 196 labels in
Appendix 1 can occur on the scene/background boundary.

3.2 A USEFUL HEURISTIC 

This section previews the general discussion later
concerning how to choose a single labeling if ambiguities are
still left at the end of the regular program's operation.
The regular program keeps every conceivable interpretation.



FIGURE 3.3 (columns: junction geometry, labeling, and objects
at vertex, for FORK-3A, FORK-3B and FORK-3C)

FIGURE 3.4

Clearly in some cases the scene is essentially ambiguous,
i.e. human beings can interpret the scene in more than one
way.

Given the line drawing shown in figure 3.4, how can a
program decide which of the interpretations, A, B, C or D, is
"correct"? In a picture there may be cues about how the
objects should be separated in the details of the edges
L-J1-J2 and L-J2-J3 of figure 3.4. But given only the line
drawing of figure 3.4, the program will find the four
interpretations listed. Because we generally prefer the
scene interpretation which has the smallest number of convex
objects, I have appropriately marked all junction labelings
which include either concave edges (whether visible or not)
or three-object edges. The output of the regular program is
then a single label or list of labels for each junction.
Obviously if there is only a single label, then there is
nothing left to do. But if more than one label is left, the
program can purge labels corresponding to concave or
three-edge junctions.
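
The purge step can be sketched as follows. This is an illustration
only: the dictionary layout, the "marked" flag and the function name
purge_marked are hypothetical, and the choice to keep the full list
when every remaining label is marked is an assumption, since the
text does not say what happens in that case.

```python
def purge_marked(labels):
    """Apply the heuristic to the labels left at one junction.

    labels: list of dicts with a "name" and a boolean "marked" flag,
            where "marked" means the labeling involves a concave edge
            or a three-object edge.
    """
    if len(labels) <= 1:
        return list(labels)          # nothing left to do
    unmarked = [lab for lab in labels if not lab["marked"]]
    # Assumption: never purge a junction down to no labels at all.
    return unmarked if unmarked else list(labels)


if __name__ == "__main__":
    remaining = [
        {"name": "A", "marked": False},
        {"name": "B", "marked": True},
        {"name": "C", "marked": True},
    ]
    print(purge_marked(remaining))
```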

This heuristic correctly labels all the scenes shown in
figure 3.5A, but finds the wrong labeling for figure 3.5B
because it always prefers to interpret scenes as made up of
convex objects, and does not know enough to preclude the
convex labeling in this case because object A in figure 3.5B
has no support. Of course, for ambiguous scenes like figure
3.4 the heuristic selects interpretation A.

3.3 SHADOWS AT TRIHEDRAL VERTICES 

To find all the variations of these vertices which
include shadow edges, first note that vertices with 1, 2, 6
or 7 octants filled cannot cause shadows such that the shadow
edges appear as part of the vertex. This can be stated more
generally: in order to be a shadow-causing vertex (i.e. a
vertex where the caused shadow edge radiates from the vertex)
there must exist some viewing position for the vertex from
which either two concave edges and one convex edge or one
concave edge and two convex edges are visible. Consider the
geometries listed in figure 3.2A. First, a shadow-causing
edge must be convex. Second, unless there is at least one
concave edge adjacent to this convex edge, there can be no
surface which can have a shadow projected onto it by the
light streaming by the convex edge. Finally, a junction
which has one convex and one concave edge must have at least
one other convex or concave edge, since the convex edge and
concave edge define at least three planes which cannot meet
at any vertex with only two edges.

This immediately eliminates 73 out of 196 of the labels
in Appendix 1 from consideration. Appendix 2 shows the
shadow edges (if any) which can occur at each of the
remaining vertices. Appendix 2 is constructed in the manner
illustrated in figure 3.6; for each potential shadow-causing
vertex, imagine the light source to be in each of the octants
surrounding the vertex, and record all the resulting
junctions. I have marked each shadow edge which is part of a
shadow-causing junction with an "L" or "R" according to
whether the arrow on the shadow edge points counterclockwise
or clockwise respectively.

Any junction which contains either a clockwise shadow
edge, marked "R," or a counterclockwise shadow edge, marked
"L," is defined as a shadow-causing junction. The reason for
distinguishing between the L and R shadow edges is that this
prevents labeling an edge as if it were a shadow caused from
both its vertices. Without this device there would be no way
to prevent figure 3.7 from being labeled as shown, with line
segment L-A-B interpreted as a shadow edge. (I use "L-" as a
prefix to mean "line segment(s) joining the following
points"; thus L-A-B is the line segment joining points A and
B.) When the "L" and "R" marks are attached to each
shadow-causing junction, then the two shadow-causing
junctions at A and B in figure 3.7 are no longer compatible,
and therefore the labeling shown will not be considered
possible by the program.

3.4 OTHER NON-DEGENERATE JUNCTIONS 

I now must describe vertices which do not fall into the 
categories I have described so far. These include (1) all 
the rest of the combinations that shadow edges can form and 
(2) obscured edges. 

In figure 3.8A I show all the other non-degenerate
vertices which involve shadow edges, and in figure 3.8B I
show all the obscured edges.

Later I return to the topic of junction labels and show
how it is possible to also include junctions representing
common degeneracies and accidental alignments as well as
junctions with missing lines. In the degenerate cases I do
not include every labeling possibility; instead I include
the most common occurrences using certain observations about
junctions. This is important since I do not want to limit the
program to any particular set of objects. Fortunately
certain types of junctions are rare no matter what types of
objects are in a scene; for example, many junctions can only
occur when the eye, light and object are aligned to within a
few degrees, and when these junctions also contain unusual or
aligned edges the combined likelihood of the junctions is low
enough so that they can be safely omitted. As shown in
Chapter 7, the program can still give information about
junctions even if they do not have proper labelings listed in
the data base, provided that not too many of these occur
together in a single scene. Moreover, this approach is
reasonable, since any additional ability to use stereo images
or to move the eye or range-finding ability will allow a
program to disambiguate most of these types of features.

3.5 A CLASS OF DEGENERACIES 

As a final topic, I Include one type of degeneracy which 
cannot be resolved by eye motion or stereo. This type of 
degeneracy results when the light source Is placed In the 
plane defined by one of an object's faces. In this case, 
shadows are aligned with edges to produce junctions which are 
unlabelable given only the normal set of labels described so 
far. Two examples of such alignment are shown in figure 3.9A 
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The grand total number of legal trihedral junctions 
listed in this chapter is 505. The interesting thing in my 
estimation is that the number of junction labels, while 
fairly large, is very small compared to the number of 
possibilities if the branches of these junctions were labeled 
independently; moreover, even though I have not yet shown 
how to include various degeneracies and alignments, I believe 
that the set I have described already is sufficient for most 
scenes which a person would construct out of plane-faced 
objects, provided that he did not set out to deliberately 
confuse the program. 

Since It may not be obvious what types of common 
vertices are non-trihedral, figure 3.10 contains a number of 
such vertices. Later sections show how to handle all of 
them. 



SECTION 3.5 104 

and figure 3.9B and a complete listing of this type of 
junction is found in Appendix 3. I have excluded cases where 
a shadow edge is projected directly onto an edge of some 
other type (as In figure 3.9C). These cases are excluded 
since they would require me to define new edge labels which 
are of very limited value, although there Is no technical 
difficulty in defining such edges and junctions. I also have 
excluded, for the time being, cases like the one shown in 
figure 3.9D, since the two junctions marked only appear to be 
T junctions when the eye is In the plane defined by the light 
source and the shadow-causing edge (L-A-B or L-C-D In figure 
3.9D). If the eye is moved to the right, the shadow-causing 
junctions change to ARROWS or FORKs as illustrated In figure 
3.9E. In contrast, notice that for the scenes shown In 
figures 3.9A and 3.9B, no change In eye position can make any 
difference in the apparent geometry of the shadow-causing 
junctions. 

Later (in Chapter 6) I consider some of the common 
non-trihedral junctions which the program is likely to 
encounter. Some of these require me to define extra labels. 

4.0 COMPLETING THE REGULAR DATA BASE 

It would be hard to devise a program which could start 
with a few pieces of information and eventually yield the 
list of junctions described in Chapter 2. Moreover, even if 
such a program were written (which would indeed be 
theoretically interesting), it would be rather pointless to 
generate labels with it every time the labels are needed in 
an analysis. Instead the generating program could run once 
and save its results in a table. In this form the junction 
labelings table is a sort of compiled knowledge, computed 
once using a few general facts and methods. The knowledge in 
the current program is almost totally in this compiled form; 
this is the reason for its rapid operation, but I have paid a 
price for this speed in that I require a large amount of 
memory (about U,000 words) to store the junction labelings. 
(All the rest of the labeling program occupies only about 
4000 words of memory even though it is written in 
MICRO-PLANNER and LISP, neither of which is particularly 
noted for space efficiency.) 

4.1 REGION ILLUMINATION ASSIGNMENTS 

Given tables of allowable region illumination values 
(figure 1.6), it is easy to show how to write a program which 
expands the data base to include this information. Suppose 
that I wish to expand the labeling of the junction shown in 
figure 4.1 to include region illumination values. As coded 
for the data base, this labeling is: 

(OCRM PLUS OCLM SHCCL) 

where OCRM stands for OCclude Right Minus (see L-J-A in 
figure 4.1), PLUS represents the convex edge (see L-J-B in 
figure 4.1), OCLM stands for OCclude Left Minus (see L-J-C in 
figure 4.1), and SHCCL stands for SHadow Counterclockwise 
type L (see L-J-D in figure 4.1). 

Each of these edges can separate regions which have the 
following values (the first element is the value of the 
region located counterclockwise with respect to the edge, the 
second element is the value of the region located clockwise 
with respect to the edge): 

lists L1 and L2 that I defined earlier: 

ILLUMINE (L1, L2) 

= ((I I I) (I I SS) 
(SP SP SP) (SP SP SS) 
(SP SS I) (SP SS SP) (SP SS SS) 
(SS SP SP) (SS SP SS) 
(SS SS I) (SS SS SP) (SS SS SS)) 

= L4. 

L4 is a list of triples which gives all the possible 
values for region illuminations in the regions R0, R1, and R2 
in figure 4.1. To include R3, compute L5: 

ILLUMINE (L4, L1) 

= ((I I I I) 
(I I SS SP) (I I SS SS) 
(SP SP SP SP) (SP SP SP SS) 
(SP SP SS SP) (SP SP SS SS) 
(SP SS I I) 
(SP SS SP SP) (SP SS SP SS) 
(SP SS SS SP) (SP SS SS SS) 
(SS SP SP SP) (SS SP SP SS) 
(SS SP SS SP) (SS SP SS SS) 
(SS SS I I) 
(SS SS SP SP) (SS SS SP SS) 
(SS SS SS SP) (SS SS SS SS)) 

= L5. 

Now I only need to include the pairs for the line L-J-D, 
the shadow edge. Notice that very few of the possibilities 
for illumination can agree with R3 when R3 is forced to be a 

<The list of region illumination pairs for OCRM or OCLM> 
= L1 
= ((I I) (SP SP) (SP SS) (SS SP) (SS SS)). 

<The list of region illumination pairs for PLUS> 
= L2 
= ((I I) (I SS) (SS I) 
(SP SP) (SP SS) (SS SP) (SS SS)). 

<The list of region illumination pairs for SHCCL> 
= L3 
= ((SP I)). 

ILLUMINE is a function which takes two input lists as 
arguments, and returns a single output list. Each member of 
the output list is formed as follows: take a member of the 
second input list whose first element is the same as the last 
element of some member of the first input list. Concatenate 
these two and eliminate the duplication of the matching 
element. The output list is made up of every possible element 
which can be formed in this manner. While a verbal 
description may be somewhat difficult to understand, the 
function is not really very complicated, and I think the 
following example should make its operation clear. Using the 
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type SP region: 

ILLUMINE (L5, L3) 

= ((I I SS SP I) 
(SP SP SP SP I) 
(SP SP SS SP I) 
(SP SS SP SP I) 
(SP SS SS SP I) 
(SS SP SP SP I) 
(SS SP SS SP I) 
(SS SS SP SP I) 
(SS SS SS SP I)) 

= L6. 

Now, to find the labelings for this junction, the last 
condition requires that since the first and last elements of 
each labeling in L6 both refer to R0, their values must be 
the same. Therefore I apply function FINALIZE, which only 
keeps members of a list whose first and last elements are the 
same: 

FINALIZE (L6) = ((I I SS SP I)). 
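
The operation of ILLUMINE and FINALIZE on these lists can be checked with a short sketch in a modern language. The Python fragment below is only an illustration, not the MICRO-PLANNER/LISP code of the program itself; the names illumine and finalize and the encoding of each list as Python tuples are assumptions made for this sketch.

```python
# Region illumination values: I = illuminated, SP = projected shadow,
# SS = self-shadowed. Each pair is (counterclockwise region, clockwise region).
L1 = [("I", "I"), ("SP", "SP"), ("SP", "SS"), ("SS", "SP"), ("SS", "SS")]  # OCRM or OCLM
L2 = [("I", "I"), ("I", "SS"), ("SS", "I"),
      ("SP", "SP"), ("SP", "SS"), ("SS", "SP"), ("SS", "SS")]              # PLUS
L3 = [("SP", "I")]                                                         # SHCCL


def illumine(first, second):
    """Join every member of `first` with every member of `second` whose
    first element equals the last element of the member of `first`,
    dropping the duplicated matching element."""
    out = []
    for a in first:
        for b in second:
            if a[-1] == b[0]:
                out.append(a + b[1:])
    return out


def finalize(labelings):
    """Keep only members whose first and last elements agree
    (both refer to the same region, R0)."""
    return [lab for lab in labelings if lab[0] == lab[-1]]


L4 = illumine(L1, L2)   # values for R0, R1, R2
L5 = illumine(L4, L1)   # values for R0, R1, R2, R3
L6 = illumine(L5, L3)   # values for R0, R1, R2, R3, R0
print(len(L4), len(L5), len(L6))
print(finalize(L6))
```

Under these assumptions the sketch reproduces the lists above: L4, L5 and L6 have 12, 21 and 9 members, and only (I I SS SP I) survives FINALIZE.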

This represents the only possible region illumination 
labeling for this junction as shown in figure 4.2A. As I 
mentioned earlier, it is true in general that shadow-causing 
junctions (and a number of other junctions involving shadows) 
have only one possible region illumination labeling. The 

In order to include illumination information in the data 
base, I merely append the region illumination value names to 
the name of each label. Thus I subdivide each label type 
(except shadow edge labels) into a number of possibilities, 
as shown in Table 4.1. As I mentioned in Chapter 2, 
expanding the number of line labels does not increase the 
total number of junction labels as much as one might imagine 
(see Table 2.2). 

Fully 268 of the 505 labelings listed in Chapter 3, over 
half, have only one possible region illumination 
interpretation! The largest possible number of illumination 
interpretations for any junction is 3^n, where n is the number 
of junction branches. A number of T junctions actually have 
27 interpretations (for example, this is true of any T made 
up of three occluding edges). 
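
The 3^n bound can be illustrated by brute-force counting. The following Python sketch is an illustration only: it assumes that the n regions around a junction are numbered so that branch k separates region k from region k+1 (cyclically), and that an occluding edge admits all nine illumination pairs. The name count_interpretations is hypothetical.

```python
from itertools import product

VALUES = ("I", "SP", "SS")
ALL_PAIRS = set(product(VALUES, repeat=2))   # nine pairs, assumed for occluding edges

# Pair lists for the worked example (OCRM PLUS OCLM SHCCL).
L1 = {("I", "I"), ("SP", "SP"), ("SP", "SS"), ("SS", "SP"), ("SS", "SS")}
L2 = {("I", "I"), ("I", "SS"), ("SS", "I"),
      ("SP", "SP"), ("SP", "SS"), ("SS", "SP"), ("SS", "SS")}
L3 = {("SP", "I")}


def count_interpretations(edge_pairs):
    """Count assignments of I/SP/SS to the n regions around a junction
    such that every branch sees an allowed (region k, region k+1) pair."""
    n = len(edge_pairs)
    count = 0
    for regions in product(VALUES, repeat=n):   # at most 3**n candidates
        if all((regions[k], regions[(k + 1) % n]) in edge_pairs[k]
               for k in range(n)):
            count += 1
    return count


print(count_interpretations([ALL_PAIRS] * 3))      # a T of three occluding edges
print(count_interpretations([L1, L2, L1, L3]))     # the worked example
```

With these assumptions the unconstrained T reaches the full 3^3 = 27 interpretations, while the four-branch worked example is cut down to a single one, in agreement with the result of FINALIZE.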
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exceptions to this rule are shadow-causing junctions where 
one region segment of the junction is obscured by the vertex 
which gives rise to the junction. To understand this 
distinction, try finding the region illumination values for 
the junction in figure 4.2B as an exercise, especially if you 
are not entirely clear about the operation of ILLUMINE and 
FINALIZE. You will need the list of possible region 
illumination pairs for L-V-A and L-V-D in figure 4.2B; these 
edges can each be assigned any of the possible region 
illumination pairs: 



<The list of region illumination pairs for OCR edges (such as 
L-V-A) and OCL edges (such as L-V-D)> 

= ((I I) (I SP) (I SS) 
(SP I) (SP SP) (SP SS) 
(SS I) (SS SP) (SS SS)) 

= L7. 
Your answer should be: 



((I SS SP I I) 
(SP SS SP I SP) 
(SS SS SP I SS)) 



The answer is illustrated in figure 4.2C. 

NEW LABELS 

OCR-II, OCR-ISP, OCR-ISS, 
OCR-SPI, OCR-SPSP, OCR-SPSS, 
OCR-SSI, OCR-SSSP, OCR-SSSS 

OCL-II, OCL-ISP, OCL-ISS, 
OCL-SPI, OCL-SPSP, OCL-SPSS, 
OCL-SSI, OCL-SSSP, OCL-SSSS 

NEW TOTAL: 57 

Table 4.1 

4.2 SUMMARY OF THE DATA BASE 

Although there are 505 labels listed in Chapter 3, the 
actual number of elements in the label lists for each 
junction will be larger than we might expect, since different 
permutations of labels count as different elements in some of 
the lists. The total number of list elements needed to 
represent the 505 labelings is 717, and this number expands 
to 3256 when the region illumination information is added to 
the labelings. Table 4.2 shows the number of elements in 
each list with and without region illumination information. 
This table differs from Table 2.2 in that it includes only 
the differences in the list lengths which are caused by 
adding region illumination information. 

A little cleverness is required to avoid duplicate 
labelings when including the different permutations of X 
junctions. This is because some X junctions give rise to two 
elements in the X labelings list, while the rest add only one 
element. Figure 4.3B shows an X junction which requires two 
elements to be added to the list, while figure 4.3C shows two 
labelings which each add only one element to the data base. 
Most shadow X junctions give rise to two elements in the data 
base, and most junctions without shadows give rise to one. 

It is now possible to describe how the program handles 
each junction it encounters: 

(1) If the junction is an L, ARROW, T, K, PEAK, X, KX, 
or KXX, it uniquely orders the junction's line segments (by 
choosing a particular line segment and considering the rest 
as ordered in a clockwise direction from this line segment). 

(2) If the junction is a FORK, MULTI or XX, it chooses 
one line segment arbitrarily. 

(3) It then fetches a list of labels which contains 
every possible set of assignments for the lines (excluding 
the possibilities of accidental alignments and degeneracies, 
and junctions with missing lines) and associates this list 
with the junction. 

It makes absolutely no difference whether the program 
obtains this list from a table (the compiled knowledge case) 
or whether it must perform extensive computations to generate 
the list (the generated knowledge case). Similarly, it does 
not matter at all that various members of the list bear a 
particular relation to each other, e.g. as in the case of a 
FORK junction, where most elements of the list have two other 
elements which are permutations of the element. When I 
return to the issues of degeneracies, accidental alignments 
and missing lines, all I need to show is how the labelings 
corresponding to these cases can be added to the appropriate 
junction lists. The machinery to choose a particular element 
operates independently of just what the labelings actually 
are. 

The only apparent exceptions are those labels marked to 
indicate that the vertices which cause them are either 
non-trihedral or concave or the result of alignment of 
surface and the light source. This information can be used 
optionally as the final step in the operation of the program 
if it is necessary to select a single labeling for an 
ambiguous junction. In such a case these marks enable the 
program to make a simple judgement about which 
interpretations are most likely. Of course if only single 
interpretations remain before the final step, or if I do not 
care that some junctions are not uniquely specified, then the 
program does not need to use these heuristics at all. (Such 
a case occurs when I only wish to find edge geometries and do 
not care about region illumination. Often ambiguous labels 
differ in the type of illumination for various regions but 
provide a unique labeling of edge geometry.) 
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5.0 SELECTION RULES 

Now that I have shown how to generate a large number of 
possible labels for a junction, I will show how to go about 
eliminating all but one of them. The strategy for doing this 
involves: 

(1) using selection rules to eliminate as many labels 
as possible on the basis of relatively local information such 
as region brightness or line segment directions, and 

(2) using the main portion of the program to remove 
labels which cannot be part of any total scene labeling. 

5.1 REGION BRIGHTNESS 

If I know only that line segment L-A-B is a line in a 
scene, then it can theoretically be assigned any of the 57 
possible labels. Once I know that L-A-B has an ARROW at one 
of its ends as shown in figure 5.1B, the number of 
possibilities drops to 19. Suppose that I know, in addition, 
the relative brightness of R1 and R2 in the neighborhood of 
L-A-B in figure 5.1C. There are three possibilities: (1) R1 
is darker than R2, (2) R2 is darker than R1, or (3) the 
brightness of R1 is equal to the brightness of R2. 

If (1) is true, I know for certain that if L-A-B is a 
shadow edge, then R1 must be the shadowed side and R2 the 
illuminated side. Obviously if (2) is true, then the 
opposite holds, i.e. R2 must be the shadowed side and R1 must 
be the illuminated side. If (3) is true, then it is 
impossible for L-A-B to be a shadow edge at all. (If I happen 
to also know that each object in a scene has all its faces 
painted identically with a non-reflective finish, then I can 
also eliminate more labels. In this case, if (1) is true, 
then L-A-B cannot be labeled as a convex edge with region R1 
illuminated and R2 shadowed type SS; if (2) is true, then 
L-A-B cannot be labeled as convex with R2 illuminated and R1 
shadowed type SS; and if (3) is true, then neither of these 
labels is possible.) 

Each junction from Chapter 3 which has a star in one of 
its segments is listed separately from junctions which have 
the same geometry but which cannot occur on the 
scene/background boundary. Thus the list of ARROW labels is 
divided into ARROW-B, a list made up of those labels which 
can occur on the scene/background boundary, and ARROW-I, made 
up of those which must occur on the interior of a scene. The 
total list of junctions which can also appear in the interior 
of a scene is found by appending ARROW-B to ARROW-I, since 
the scene/background labelings can appear on the interior of 
the scene as shown in figure 5.2. Table 5.1 lists the number 
of trihedral junction labels which can occur on the interior 
and on the scene/background boundary for each type of 
junction. Appendix 4 lists all of the junctions which can 
occur on the scene/background boundary including region 
illumination information. To obtain Appendix 4, I have 
assumed that the light source is positioned in one of the 
four octants of space above the support surface. This 
restriction means that the background is guaranteed to always 
be illuminated. 
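
The region-brightness rules of Section 5.1 amount to a simple filter over candidate line labels. The Python sketch below is illustrative only; the encoding of a label as a triple (edge type, illumination of R1, illumination of R2), the relation names, and the function brightness_filter are all assumptions introduced here and are not the representation used in the data base.

```python
def brightness_filter(labels, relation, uniform_paint=False):
    """Drop line labels that contradict the observed brightness.

    labels:   iterable of (edge_type, illum_r1, illum_r2) triples
              (hypothetical encoding; illum is "I", "SP" or "SS")
    relation: "r1_darker", "r2_darker" or "equal"
    uniform_paint: True if every object is known to be painted
              identically with a non-reflective finish
    """
    kept = []
    for edge, r1, r2 in labels:
        if edge == "shadow":
            if relation == "equal":
                continue                      # case (3): no shadow edge possible
            dark, light = (r1, r2) if relation == "r1_darker" else (r2, r1)
            if dark == "I" or light != "I":
                continue                      # darker side must be the shadowed side
        if uniform_paint and edge == "convex":
            # An illuminated/self-shadowed convex edge needs the SS side darker.
            if (r1, r2) == ("I", "SS") and relation != "r2_darker":
                continue
            if (r1, r2) == ("SS", "I") and relation != "r1_darker":
                continue
        kept.append((edge, r1, r2))
    return kept


candidates = [("shadow", "SP", "I"), ("shadow", "I", "SP"),
              ("convex", "I", "SS"), ("convex", "SS", "I"),
              ("convex", "I", "I")]
print(brightness_filter(candidates, "r1_darker"))
print(brightness_filter(candidates, "equal", uniform_paint=True))
```

In this small example, knowing that R1 is darker removes the shadow label with R1 illuminated, and knowing that the two regions are equally bright (with uniform paint) leaves only the convex label with both sides illuminated.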

5.2 SCENE/BACKGROUND BOUNDARY REVISITED 

It is easy to find all the junctions which can occur 
around the scene/background boundary. All that is necessary 
is to make a list of all the line segments which can occur 
along the boundary and then look for segments of junctions 
which are bounded by two members of this set. 

Obviously, if I can determine which lines in the line 
drawing are part of the scene/background boundary, this 
knowledge can be used to great advantage. It is, in fact, 
not difficult to determine this boundary; any of several 
strategies will work. Two examples are: 

(1) Look for regions which touch the edge of the field 
of view and append them all together, or 

(2) Find the contour which has the property that every 
junction lies on or inside it (see Mahabala 1969). 

Both of these methods require that the scene be 
completely surrounded by the background region or regions. 
As shown in figure 5.3, method (1) works even if the 
background is made up of more than one region. 
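
Method (1) can be sketched as follows. This Python fragment is an illustration under a strong assumption: the field of view is given as a rectangular grid in which every cell carries the number of the region it belongs to. The program itself works from line drawings, so the grid, the region numbers, and the name background_regions are all hypothetical.

```python
def background_regions(region_map):
    """Return the set of region numbers that touch the edge of the
    field of view; together they make up the background."""
    rows, cols = len(region_map), len(region_map[0])
    border = set()
    for c in range(cols):
        border.add(region_map[0][c])          # top edge
        border.add(region_map[rows - 1][c])   # bottom edge
    for r in range(rows):
        border.add(region_map[r][0])          # left edge
        border.add(region_map[r][cols - 1])   # right edge
    return border


# A toy field of view: regions 1, 2 and 3 form the scene; the
# background is split into two regions, 0 and 5.
region_map = [
    [0, 0, 0, 0, 0, 5],
    [0, 1, 1, 2, 0, 5],
    [0, 1, 3, 2, 0, 5],
    [0, 0, 0, 0, 5, 5],
]
print(background_regions(region_map))
```

Because the scene is completely surrounded, only regions 0 and 5 touch the frame, so the method finds the background even though it consists of more than one region.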

Once the program has found which region is the 
background region, it can also find how each junction is 
oriented on the scene/background boundary. Some junctions 
always appear in the same orientation; for example, ARROW 
and PEAK junctions can only be oriented so that the 
background region is the region whose angle is greater than 
180 degrees, and K junctions can only have the region whose 
angle is 180 degrees as the background region (see Appendix 
4). 

Of course there is no way to easily define the 
orientations of FORK, XX, or MULTI junctions. However, as 
shown in figure 5.4, the L, T, X and KX junctions which 
appear on the scene/background boundary can be sorted 
according to which of their segments is the background 
region. 

Consider figure 5.5. Each of the L, T, and X junctions 
is marked to indicate which orientation it has. Table 5.2 
shows that this distinction makes a significant reduction in 
the size of the starting list of label assignments for these 
junctions. 

5.3 EXTENDING THE SUPPORT SURFACE 

Consider a problem posed by the scene shown in figure 
5.6. If my labeling program is given this scene with the set 
of labels defined so far, the program will not find a unique 
labeling for L-C-D, even though it finds L-A-B to be a shadow 
edge, and therefore labels R1 as a projected shadow region. 

At one time I thought that I would need to write a 
"demon" program which would check for shadow edges on 
the table, assert that such a shadow region is coplanar 
with the table, and then eliminate any edges other than 
planar ones whenever such a region shares an edge with 
the illuminated portion of the table. This type of 
approach seemed rather ad hoc to me, and started me 
thinking about how I could include region information as 
part of each junction label. There could be many added 
benefits to such an approach: it seemed clear that just 
as I was able to vastly reduce the number of labels from 
which to select possible ones by knowing that a junction 
was on the scene/background boundary, I should be able 
to reduce the number of labels for a junction which was 
interior to the scene but which had the table as one of 
its region segments. 

Therefore I defined new labels as shown in Table 5.3 to 
denote any edge which has the table as one of its adjoining 
regions. Since I have restricted the light source to be in 
the quadrants of space above the support surface, I can be 
certain that any region which is part of the table can never 
be self-shadowed, type SS. I have used this fact in 
constructing Table 5.3. Any edge which touches or obscures 
the table is marked by appending a "T" to its name or 
printing a "T" next to the line segment. The old labels 
without "T" are understood to represent edges which do not 
have the table as either of their adjoining regions. The 
addition of these 24 edge types brings the total number of 
line labels to 81. 



[Figure 5.6] 

The tables which show the allowable region illumination 
pairs for these edges (analogous to figure 2.6) appear in 
Table 5.4. 

To update the lists of junction labels, I must add to 
the present set: 

(1) All the junctions listed in Appendix 4, but with 
"T" printed next to both line segments which bound the region 
containing the star. (These regions can be part of the 
background of the scene, i.e. the portion of the table which 
surrounds the scene and is illuminated.) Some of these 
junctions can also have other projected shadow regions which 
are part of the table, so that "T" must be added to line 
segments other than the two bounding the starred region. 
These junctions are listed in Appendix 5. 

(2) All the junctions which can bound a projected 
shadow (type SP) region which is also part of the table. 

Table 5.5 shows the situation now for the relative 
numbers of junctions which can occur on the scene/background 
boundary. While the number of labelings possible if the 
branches were labeled independently has increased sharply 
with the increase of the number of line labels from 57 to 81, 
the actual number of junction labelings has not changed for 
the scene/background boundary and has increased only 
moderately for the scene interior. 

The value of these additions to the data base is 
especially pronounced for scenes like figure 1.2 where the 
table surface accounts for seven of the interior regions as 
well as the background region. In addition to the 
improvement for scenes of this sort, there are other 
benefits. Consider figure 5.7. How many objects are in this 
scene? 

Now look at figure 5.8. Given figure 5.7, my program 
will return both interpretations: the one we would usually 
expect (region R as the table, with object C resting on the 
table) or the interpretation shown in figure 5.8. Thus the 
new labels enable the program to make finer distinctions than 
it could before. Notice that we could also use the table 
information to make another heuristic rule: if there are two 
interpretations of an interior region, one as the table and 
one as an extra object, choose the table interpretation. 
(This corresponds to choosing the simplest interpretation, 
i.e. the one with the fewest objects.) 

5.4 DISCUSSION 

This section is speculative; nothing in it is critical 
to an understanding of my program. 

Underlying the previous section are some important kinds 
of distinctions between levels of understanding which I 
believe are worth pursuing at greater length at this point. 
There are several levels of understanding which a program can 
have about a particular property of scene features (e.g. 
"this region is part of the table"): 

(1) The first level of understanding is that the 
program be able to express the fact that a given portion of 
the scene does or does not have the property. As an example, 
until the program had the labels which labeled regions that 
were part of the table, it could not express the difference 
between the two possible interpretations of figure 5.7. 

(2) The next series of levels are ones where the 
program recognizes more and more instances of features which 
cannot have the property (and consequently recognizes more 
precisely where the property can apply). My program's hard 
knowledge ends at this level; for some cases its 
understanding is sufficient to uniquely recognize a property, 
in other cases it is unable to select between two or more 
possibilities. 

(3) I believe that the next levels of understanding are 
characterized by the ability to define a critical test (or 
series of critical tests) which will allow a program to 
eliminate remaining possibilities until only one is left. 
Such a test might be "If I remove the object in front, I will 
be able to see whether or not that region is connected to the 
table surface" or "if I move to the right, and if that region 
is part of the table, then I should be able to see an edge at 
point (x,y)". I claim that this must be the next level of 
knowledge since many line drawings simply do not contain cues 
which allow a program (or a person) to decide between various 
possibilities. 

However, let me make a distinction between knowledge and 
expectation. Even if I am not allowed to make further tests, 
I still expect the scene to have a particular form. 
Moreover, I believe that this expectation, simulated by 
heuristic rules in my program, is instrumental in deciding 
just which critical tests I should make. For example, if n 
interpretations are possible, my suspicion is that I pick the 
one I expect to be true, and on the basis of this expectation 
I then choose a test (or tests) to eliminate all the (n-1) 
other possibilities. After performing this test, I then have 
knowledge which either supports my expectation or forces me 
to form or choose a new expectation. 

The curious fact about my perception is that I only see 
one interpretation at a time even when I know that a scene is 
ambiguous. (Take for example the reversing illusion which 
alternates between a vase and two faces in profile, depending 
upon which regions are viewed as figures and which are viewed 
as background (Koffka 1935)). 

Even when I have insufficient solid knowledge on which 
to base my interpretation of a scene, my expectation seems to 
carry the same force of conviction that solid knowledge 
would. Nonetheless, I can change my interpretations of 
scenes either when I am faced with new evidence (by a change 
in my relation to a scene or change in the scene) or if I am 
challenged about my interpretation (Are you sure?). 

Moreover, I am aware of ambiguity in another way; even 
though my own interpretation may carry a sense of conviction 
with it, and even though I don't usually change this 
interpretation without reason, I can easily understand how 
another person could interpret a scene in one way while at 
the same time I am seeing it in a different way, where I am 
using seeing to mean interpreting with conviction of truth. 

I do not believe it is worthwhile to delve too much 
deeper into speculation about similarities between my system 
and human perception. For example, it doesn't seem to me to 
make much sense to try and decide whether people generate 
alternative interpretations when they are needed or whether 
(as in my program) they keep all the active alternative 
interpretations but are only aware of the expected one at any 
given time. 

Nonetheless, I think that in connection with ambiguity, 
the notion of knowledge at "other levels" as the ability to 
eliminate interpretations, and the notion of expectation as 
the default choice of an interpretation when I run out of 
solid knowledge, are ideas of central importance. 

5.5 AN EXAMPLE 

I have now shown how to use selection rules to narrow 
down the choices for junction labels on the basis of various 
kinds of cues from the line drawing. To give an idea of how 
much these rules help, look at figure 5.9. Next to each 
junction I have listed the numbers of labels which are 
possible for it before and after applying the selection 
rules. I have assumed that the program knows that R0 is the 
support surface and that the circled numbers in each region 
indicate the relative brightness (the higher a number, the 
brighter the region). Notice that one junction, the peak on 
the scene/background boundary, can be uniquely labeled using 
only selection rules. Most of the interior junctions remain 
highly ambiguous. 
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6.0 THE MAIN LABELING PROGRAM 

You will recall that I described at some length in 
Section 2.4 a "filter program" which systematically removes 
junction labels whenever there are no possible matches for 
the labels at adjacent junctions. Now that I have shown a 
good deal more about the junction labels and the use of the 
selection rules, I would like to treat this program again 
from a somewhat different perspective. 

6.1 A SMALL EXAMPLE 

Suppose that the program is working on a scene, a 
portion of which is shown in figure 6.1. Assume that the 
selection rules eliminate all labels for each type of 
junction except those shown at the bottom of the figure. 
Remember that the selection rules operate only locally, i.e. 
they give the same list of possibilities no matter how the 
labeling has proceeded or in what order the junctions are 
taken. All the step numbers refer to figure 6.2, which 
summarizes the successive lists attached to each junction: 

Step 1: Suppose that the program starts with J2, and 
that all of the other junctions are unlabeled. Then the 
program assigns list L2 to J2, and since all the other 
junctions are unlabeled, it has no basis on which to 
eliminate any of the labels in L2. As far as the program 
knows, all of these labelings are still possible. 

Step 2: Now suppose that it next labels J1 by attaching 
to it the list L1. When it checks the junctions adjacent to 
J1 it now can see that J2 has already been labeled. 

Step 3: Therefore the program looks at J2 to find what 
restrictions, if any, have already been placed on line 
segment L-J1-J2. In this case, the restrictions are that 
L-J1-J2 must be labeled with either "B" or "C" or "A" or "D" 
or "F", i.e. with any letter which appears third in an 
element of L2. Each element of L1 which does not have "B", 
"C", "A", "D", or "F" as its first letter can then be 
eliminated. Therefore the program drops "(G H)", "(E A)" and 
"(E B)" as possibilities and L1 becomes 

((A B) (A C) (A D) (B B) (B E) (C F) (F A)). 

Step 4: Now the program uses this same reasoning in the 
opposite direction. In what ways, if any, does the fact that 
J1 must be labeled from the list restrict the labels of 
adjacent junctions? Only J2 of the adjacent junctions has 
been labeled so far, so only J2 can be affected. The only 
labels which are possible for J2 are those elements of L2 
which have as a third letter "A" or "B" or "C" or "F". 
Therefore, the program eliminates "(F A D)" as a possible 
label and L2 becomes 

((A B B) (A B C) (B C A) (D B F)). 

Can the program eliminate any other labels because "(F A 
D)" has been eliminated? No, since no other neighbors of J2 
except J1 have been labeled, and the reason "(F A D)" was 
eliminated was because it had no counterpart at J1. 

Step 5: The program now can move on to J3 and label it 
with L3. 

Step 6: Each label for J3 must have a third letter 
equal to one of the first letters from a label in L2. These 
letters are "A", "B" and "D". Therefore the program 
eliminates "(G H I)", "(F B O", "(D B F)", "(A B E)" and "(D 
C G)" from L3 and sets L3 to ((A B A) (B C A)). 
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(1) attaching labels, 

(2) removing any of these labels which are impossible 
given the current context of this junction, and 

(3) iteratively removing labelings from the context by 
allowing the new restrictions embodied in the list of labels 
for the junction to propagate outward from the junction until 
no more changes in the context can be made. 

There are two points of importance: 

(1) The solution the program finds is the same no 
matter where it begins in the scene, and 

(2) the program is guaranteed to be finished after one 
pass through the junctions, where it performs the three 
actions listed above at each junction. 

Given a line drawing with N junctions, a data base which 
has no more than M possible labelings for any junction, and a 
situation where any number of junctions from 0 to N have 
already been labeled, let condition C be one where for each 
possible line label which can be assigned to a line segment 
either 

(1) there is at least one matching line label assigned 

Step 7: What labels now are possible for J2? Since the 
only remaining labels for J3 both set L-J2-J3 to "A", the 
program eliminates "(B C A)" and "(D B F)" from L2 so that L2 
becomes ((A B B) (A B C)). 

Step 8: This time, a neighbor of J2, namely J1, has 
been labeled already, so the program must check to see 
whether eliminating the elements of L2 has placed further 
restrictions on L1. Only elements of L1 which have a first 
letter "B" or "C" are possible labels now, so the program 
eliminates "(A B)", "(A C)", "(A D)", and "(F A)". L1 thus 
becomes ((B B) (B E) (C F)). 

Since no other neighbors of J1 are labeled, the effects 
of this change cannot propagate any further. 
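
Steps 1 through 8 can be replayed with a short sketch of the filtering process. The Python fragment below is an illustration, not the program itself. Several things in it are assumptions: the function name label_scene; the encoding of each label as a string of letters; the order of the elements in the starting lists L1 and L3 (figure 6.1 is not reproduced here); and the reading, taken from the wording of the steps, that line L-J1-J2 is the first letter of a J1 label and the third letter of a J2 label, while L-J2-J3 is the first letter of a J2 label and the third letter of a J3 label. One element of L3 that Step 6 eliminates is left out because its letters are uncertain; this does not affect the final lists.

```python
from collections import deque


def label_scene(initial, lines, order):
    """Attach label lists junction by junction and propagate eliminations.

    initial: junction -> list of candidate labels (strings of line letters)
    lines:   (junction_a, index_a, junction_b, index_b) tuples; the letter at
             index_a of a label at junction_a must equal the letter at
             index_b of some label at junction_b, and vice versa
    order:   the order in which junctions are labeled
    """
    labels = {}

    def links(j):
        for a, ia, b, ib in lines:
            if a == j:
                yield ia, b, ib
            if b == j:
                yield ib, a, ia

    def prune(j):
        kept = []
        for lab in labels[j]:
            ok = True
            for own, nbr, theirs in links(j):
                if nbr in labels and not any(o[theirs] == lab[own]
                                             for o in labels[nbr]):
                    ok = False      # no matching label at a labeled neighbor
                    break
            if ok:
                kept.append(lab)
        changed = len(kept) != len(labels[j])
        labels[j] = kept
        return changed

    for j in order:
        labels[j] = list(initial[j])          # (1) attach labels
        prune(j)                              # (2) filter against the context
        queue = deque(n for _, n, _ in links(j) if n in labels)
        while queue:                          # (3) propagate outward
            n = queue.popleft()
            if prune(n):
                queue.extend(m for _, m, _ in links(n) if m in labels)
    return labels


initial = {
    "J1": ["AB", "AC", "AD", "BB", "BE", "CF", "EA", "EB", "FA", "GH"],
    "J2": ["ABB", "ABC", "BCA", "FAD", "DBF"],
    "J3": ["ABA", "BCA", "GHI", "DBF", "ABE", "DCG"],
}
lines = [("J1", 0, "J2", 2), ("J2", 0, "J3", 2)]
result = label_scene(initial, lines, order=["J2", "J1", "J3"])
print(result)
```

Under these assumptions the sketch ends with the same lists as Step 8: L1 = ((B B) (B E) (C F)), L2 = ((A B B) (A B C)), and L3 = ((A B A) (B C A)).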

6.2 DISCUSSION 

I think it is easiest to view the process of the program 
at each junction as having three actions: 

possible that the set of labels for J can be reduced further
because neighbor J2 has no match for one or more labels still
attached to J. The program would then have to go back to
line L-J-J1 again to see whether more labels could be
eliminated from J1. By considering the effects of each of
J's neighbors on J's labels first, the program guarantees
that as many labels as possible have been eliminated from J's
label list before using this list to recompute the lists for
J's neighbors.

Condition C can now only be untrue along line segments
joining J with its neighbors and, moreover, can only be
untrue in one direction, i.e. J's neighbors may have
unmatched labels, but not vice-versa. When the program
eliminates the unmatched labels from each of J's neighbors, C
is now satisfied on each line segment joining J to its
neighbors and C can only be unsatisfied along the line
segments joining J's neighbors with the neighbors of J's
neighbors, and again only in an "outward" direction, i.e. the
junctions two line segments away from J can have unmatched
labels, but all those junctions one line segment away (J's
neighbors) cannot have unmatched labels.

to the junction at the other end of this line segment, or
else 

(2) the junction at the other end of the line segment 
has not been labeled. 

This condition C must be satisfied before the program 
moves on to a new junction; the program keeps track of the 
line segments on which the condition may not be satisfied. 

When the program begins labeling a junction J, assume
that C holds throughout the line drawing. When the junction,
previously unlabeled, has labels added, the only line
segments along which C can be violated are the line segments
which join J to its neighbors, and it is possible for C to be
unsatisfied in both directions on these segments (i.e. both J
and J's neighbors may have unmatched line labels).
Therefore, to make sure that the program needs to consider
each line segment a minimum number of times, the program
first uses the lists of possible labels specified by J's
neighbors to eliminate all impossible labels from J.

To see why this is the correct way to proceed, suppose
that the program used J's initial set of labels to eliminate
some labels from one of J's neighbors, J1. It is then

The line segments on which C does not hold continue to
spread outward to the neighbors of junctions two segments
away from J, then junctions three segments away from J, etc.,
but only as long as labels are being removed from any
junctions. As soon as the program reaches a step where no
labels are removed from any junction, then the program knows
that condition C must be satisfied everywhere in the scene,
and it can move on to the next unlabeled junction.
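
The outward spread described above can be sketched as a small
worklist procedure. This is a minimal illustration in Python
under stated assumptions, not the original program: the names
propagate and reduce are hypothetical, each labeling is written
as a dictionary from neighbor to line label, and line labels
are assumed to read the same from both ends of a line.

```python
from collections import deque


def propagate(labels, neighbors, start):
    """Restore condition C after `start` has just received its labelings.

    labels:    dict from each labeled junction to its list of labelings;
               a labeling is a dict {neighbor: line label}.
    neighbors: dict from every junction to its adjacent junctions.
    """

    def reduce(j, k):
        # Drop the labelings of j that no labeling of k matches on line j-k.
        if j not in labels or k not in labels:
            return False  # an unlabeled junction imposes no restriction
        allowed = {lab[j] for lab in labels[k]}
        kept = [lab for lab in labels[j] if lab[k] in allowed]
        changed = len(kept) < len(labels[j])
        labels[j] = kept
        return changed

    # First let every neighbor restrict the newly labeled junction ...
    for k in neighbors[start]:
        reduce(start, k)

    # ... then push restrictions outward until no label is removed.
    queue = deque([start])
    while queue:
        j = queue.popleft()
        for k in neighbors[j]:
            if reduce(k, j):
                queue.append(k)
    return labels


# A chain a - b - c in which c is the junction being labeled now.
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {
    "a": [{"b": "+"}, {"b": "-"}],
    "b": [{"a": "+", "c": "+"}, {"a": "-", "c": "-"}],
    "c": [{"b": "+"}],
}
result = propagate(labels, neighbors, "c")
print(result["a"], result["b"])  # [{'b': '+'}] [{'a': '+', 'c': '+'}]
```

A junction re-enters the queue only when it has just lost at
least one labeling, so the loop stops as soon as a pass removes
nothing.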

Figure 6.3 traces a situation which could occur on 
successive steps in a line drawing where all junctions except 
J have been labeled already. I have filled in the line 
segments along which condition C could be violated at each 
stage of the program's iterations. The mark ">" indicates 
which junction can have unmatched labels; it is used like
the same sign meaning "greater than", so that you can read

Ji > Jk

as "the number of labels at Ji is greater than the number of
labels at Jk", i.e. Ji may have labels which are not matched
by ones at Jk.

The violations of C can spread outward to eventually
touch any line segment of a line drawing, but only if the
number of labels can be reduced at each junction on some path
between the junction the program is currently labeling and
the line segment. If any of the junctions in Figure 6.3 were
unlabeled or if a unique label had already been found for the
junction, then no violations of C could propagate through
that junction.

Figure 6.4 represents just such a situation. The line
drawing is assumed to be completely labeled except for
junction J, but this time J1 already has been uniquely
labeled. Thus it can never be the case that J1 has unmatched
labels. Notice that Figure 6.4 also represents equally well
the case where J1 has not yet been labeled.

One final point: the process is guaranteed to terminate,
since if there are N junctions and no more than M labels
possible for any one junction, the process can never go on
for more than M x N steps at the very worst. This is
important since the restrictions can propagate back to the
junction which initiated the process. To see that the
possibility of cycles does not create any difficulties,
consider the following trick. Suppose that as soon as the

(2) next label all junctions which bound regions that
share an edge or junction with the background.

To see why the program is faster when it eliminates as
many possibilities as early as it can, I must first give some
idea about the amounts of computation needed for various
phases of the program. The basic operation involves removing
unmatched labels from junction lists. The removal is done in
the following manner:

Assume that the junction whose list of labels must be
reduced is called J2, that its neighbor is J1, and that for
any label in the lists of either J1 or J2, the first line
label represents the line joining them. Thus if (A B C) is
one possible junction labeling in J1's list, then "A" is the
line label that this junction labeling would assign to line
L-J1-J2, and similarly, if (D E F) is a labeling from J2's
list, then "D" is the line label which refers to L-J2-J1.

Since J2's list is the one to be reduced, first look at
J1's label list and make a list which consists of the labels
which J1 can apply to L-J2-J1. Notice that I have up to now
glossed over the fact that for most lines, the label appears
different depending on which end of the line we choose as a

starting junction has been checked against each of its
neighbors, that all the remaining labels are removed from it.
The restrictions can then spread outward only until no more
changes can be made; now look at the process as though the
junction were being labeled for the first time with the set
of junctions just removed as its starting junction set. This
process can then be repeated as often as necessary, but the
number of times can never be greater than the initial number
of labelings assigned to the junction, since the process
terminates if no more labels can be removed from the list of
possibilities.

6.3 CONTROL STRUCTURE 

While the program can start at any junction and still
arrive at the same solution, the amount of time required to
understand a scene does depend on the order in which the
junctions are labeled. The basic heuristic for speeding up
the program is to eliminate as many possibilities as early as
possible. Two techniques which help accomplish this end are
to

(1) label all the junctions on the scene/background
boundary first, since these have many fewer interpretations
than interior junctions do, and

reference point. Thus if line L-J1-J2 is labeled

[Diagram: line L-J1-J2 running from J1 to J2, with an
illuminated (I) region on one side and a projected-shadow
(SP) region on the other.]

then from J1's end it appears to be labeled as "OCR-ISP" and
from J2's end it appears to be labeled as "OCL-SPI" (for
OCclude Right-Illuminated/Shadow-Projected and OCclude
Left-Shadow-Projected/Illuminated respectively). Therefore
what we really want is the list of the opposites of the first
elements of each label for J1. Suppose that I am given the
scene portion shown in figure 6.5. If J1's list of labelings
is:



((OCR-II PLUS-II OCL-II)
 (OCR-ISP PLUS-SPI OCL-II)
 (OCRM-II PLUS-II OCLM-II)
 (SHCLR-ISP OCR-SPI OCLM-II))



Then the list that I need to compare J2's labels to is:

L1 = ((opposite (OCR-II))
      (opposite (OCR-ISP))
      (opposite (OCRM-II))
      (opposite (SHCLR-ISP)))

   = (OCL-II
      OCL-SPI
      OCLM-II
      SHCCR-SPI)

branches to see if any labels for adjacent junctions can be
removed. Thus it must compute m lists analogous to L1 above
and each of these lists has n members. Now when each of
these lists is compared to the label lists for adjacent
junctions, the program must make an average of n/2 tests for
equality for each labeling that is retained, and n tests for
equality for each labeling that is removed (for the case
where it looks through an entire list and finds no match).
Therefore for each portion of the process the amount of
computation involved is at least proportional to n.
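
The test counts in this argument can be made concrete with a
linear scan. The following Python sketch is illustrative only;
the function name and the single-letter labels are
hypothetical. A retained label costs between 1 and n equality
tests depending on where its match sits, and a removed label
always costs n.

```python
def reduce_counting_tests(candidates, allowed):
    """Scan `allowed` linearly for each candidate and count equality tests."""
    kept, tests = [], 0
    for label in candidates:
        for other in allowed:
            tests += 1
            if other == label:
                kept.append(label)
                break  # a match ends the scan early
    return kept, tests


allowed = ["a", "b", "c", "d"]  # n = 4
kept, tests = reduce_counting_tests(["a", "d", "z"], allowed)
print(kept, tests)  # ['a', 'd'] 9   ("a": 1 test, "d": 4, "z": 4)
```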

Because the amount of computation is at least
proportional to n, it is undesirable to label interior
junctions first, since most of these have much larger initial
values for n than do scene/background boundary junctions.
Not only does it take more computation to propagate any
reductions through these junctions, but each reduction is
likely to be smaller as well; if two adjacent junctions can
each be labeled in n ways out of a total of N theoretically
possible ways, then the expected number of labelings they
have in common is $n^2/N$. (This number is obtained by summing
the probability of a match for each of the n labels at one
junction; thus

J2's label list can then be compared to L1; the
condition which must be satisfied by a labeling of J2 in
order for it to be a possible labeling is that the line label
it would assign to L-J2-J1 be a member of the list L1.
Continuing with this example, suppose that J2's labeling list
is:

L2 = ((OCL-II OCR-II OCR-II)
      (OCL-ISP OCR-SPI OCR-II)
      (OCL-ISS OCR-SSI OCR-II)
      (OCL-SPI OCR-II OCR-ISP)
      (OCL-SPSP OCR-SPI OCR-ISP)
      (OCL-SPSS OCR-II OCRM-II)
      (OCL-II OCR-II OCRM-II)
      (OCRM-II OCRCR-II OCML-II))

Then the labeling list for J2 after comparing L2 to L1 is:

L2' = ((OCL-II OCR-II OCR-II)
       (OCL-SPI OCR-II OCR-ISP)
       (OCL-II OCR-II OCRM-II))
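
The reduction in this example can be reproduced with a short
Python sketch. It is an illustration under assumptions rather
than the original code: the function names are hypothetical,
the table of opposites contains only the four pairs that the
example itself states, and the label names are transcribed from
the lists above.

```python
# Opposite line labels, pair by pair as given in the example.
OPPOSITE = {
    "OCR-II": "OCL-II",
    "OCR-ISP": "OCL-SPI",
    "OCRM-II": "OCLM-II",
    "SHCLR-ISP": "SHCCR-SPI",
}

J1_LABELINGS = [
    ("OCR-II", "PLUS-II", "OCL-II"),
    ("OCR-ISP", "PLUS-SPI", "OCL-II"),
    ("OCRM-II", "PLUS-II", "OCLM-II"),
    ("SHCLR-ISP", "OCR-SPI", "OCLM-II"),
]

J2_LABELINGS = [
    ("OCL-II", "OCR-II", "OCR-II"),
    ("OCL-ISP", "OCR-SPI", "OCR-II"),
    ("OCL-ISS", "OCR-SSI", "OCR-II"),
    ("OCL-SPI", "OCR-II", "OCR-ISP"),
    ("OCL-SPSP", "OCR-SPI", "OCR-ISP"),
    ("OCL-SPSS", "OCR-II", "OCRM-II"),
    ("OCL-II", "OCR-II", "OCRM-II"),
    ("OCRM-II", "OCRCR-II", "OCML-II"),
]


def seen_from_other_end(labelings):
    """Labels the neighbor's list allows on the shared line, as seen from J2."""
    return [OPPOSITE[lab[0]] for lab in labelings]


def reduce_labelings(labelings, allowed):
    """Keep labelings whose first element (the shared line) is allowed."""
    allowed = set(allowed)
    return [lab for lab in labelings if lab[0] in allowed]


L1 = seen_from_other_end(J1_LABELINGS)
L2_reduced = reduce_labelings(J2_LABELINGS, L1)
for lab in L2_reduced:
    print(lab)
```

Only the three labelings whose first element appears in L1
survive, which agrees with the reduced list given above.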

Now I return to the original claim, that it is desirable
to remove as many labels as possible as early as possible.
Suppose a junction J has m+1 branches and n+q labels, and in
the process of labeling, q of these labels are eliminated by
a propagating reduction which comes in on one of J's
branches: this requires the program to test n+q labels for
membership in a list. The program now has to check each of J's

I included this section so that an interested reader
could get a better feeling for the operation of the program 
and also to suggest some ideas for extensions of this 
program. For example, if my labeling program were connected 
to a line-finding program such as Shirai's, my program could 
be adapted to provide intelligent guidance for deciding where 
to look next in a scene on the basis of which features had 
already been found (Shirai 1972). 

Another idea which might be interesting to follow up is
a possible parallel between the reasons why it is better for
my program to start on the scene/background boundary and the
observed fact that people presented with a figure on a
background for short periods of time see detail first on the
figure/ground boundary and require longer viewing durations
to see details in the figure interior, suggesting that our
perception proceeds from the outside inward (Koffka 1935).

I mentioned at the beginning of this paper that the
amount of time (and therefore computation) is roughly
proportional to the number of line segments in a scene. This
may not seem to fit with the obvious fact that there is
really nothing to prevent the effects caused by labeling a
single junction to propagate to every portion of a line

$\sum_{i=1}^{n} (n/N) = n \times (n/N) = n^2/N$.)

Typically scene/background boundary junctions have about
1/10 the number of possible labels an interior junction can
have, so that the expected number of labelings two
scene/background junctions will have in common is only 1% of
the expected number for two interior junctions. Similarly,
it is worthwhile to label next interior junctions which are
connected to junctions on the scene/background boundary,
since the expected number of labelings in common for these
pairs is only 10% of the number for interior junctions.
Finally, as I mentioned earlier, it is worthwhile to label
all the junctions surrounding regions which touch the
scene/background boundary, since these regions contain all
the "best" kinds of junctions, and because a chain of
junctions which closes on itself tends to be far more
restricted in its possibilities than a chain of the same
length which does not. (I will not attempt to prove that
this is so; I think it is fairly obvious that the effect is
true, although the proof of the effect is not. It is much
more obvious for a tree search procedure than for this one.)
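
The 1% and 10% figures follow directly from the expected-match
formula. The Python sketch below checks them with exact
fractions. The label counts are hypothetical, chosen only so
that a boundary junction has one tenth as many labelings as an
interior one, and the product form for two junctions with
different numbers of labelings is my generalization of the
derivation above.

```python
from fractions import Fraction


def expected_common(n1, n2, total):
    """Expected labelings in common: each of n1 labels matches with chance n2/total."""
    return Fraction(n1 * n2, total)


TOTAL = 1000               # hypothetical N
INTERIOR = 100             # hypothetical n for an interior junction
BOUNDARY = INTERIOR // 10  # boundary junctions: about one tenth as many

interior_pair = expected_common(INTERIOR, INTERIOR, TOTAL)
boundary_pair = expected_common(BOUNDARY, BOUNDARY, TOTAL)
mixed_pair = expected_common(BOUNDARY, INTERIOR, TOTAL)

print(boundary_pair / interior_pair)  # 1/100, i.e. 1%
print(mixed_pair / interior_pair)     # 1/10, i.e. 10%
```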

6.4 PROGRAM PERFORMANCE

The program portions I have now described are adequate
for labeling scenes without accidental alignments,
non-trihedral vertices or missing lines. Within this range
there are still certain types of features which confuse the
program, but before showing its limits, I will show some of
its complete successes. In all the scenes that follow, I
assume that the program knows which region is the background
region, and that it also knows the relative brightness of
various regions. The program operates nearly as well without
these facts but not as rapidly. Figure 6.6 shows a number of
scenes for which the program produces unique labelings or is
only confused about the illumination type of one or two
regions (as in figures 6.6D and 6.6I). If some of the region
brightness values were varied or omitted, the program could
be confused in the same way about the tops of
objects in figures 6.6A, 6.6B, 6.6E, 6.6G and 6.6H. In
general, the program is not particularly good at finding the
illumination types for regions unless the regions are bounded
by concave edges. This confusion has a physical basis as
well. In all the diagrams I have drawn these top surfaces as
though they were parallel to the table so that they should all

drawing.

There are good physical reasons why this seldom happens.
The basic reason is that some junctions simply do not
propagate effects to all their neighbors, and so the effects
tend to die out before getting too far. The prime type of
junction which stifles the spreading effects is the T
junction.

In most T junctions, the labelings of the upright and
crossbar portions are independent. Even if we know the exact
labeling of the crossbar portion we are unlikely to be able
to draw any conclusions about the labeling of the upright and
vice-versa. Since objects are most commonly separated by T
junctions, the effects of labeling a junction are for the
most part limited to the object of which the junction is a
part, and to the object's shadow edges, if any.

Another reason why effects do not propagate far is that
when junctions are unlabeled or when they are uniquely
labeled, they do not propagate effects at all. (This reason
was illustrated in figure 6.4.) Thus when few junctions are
labeled and when most of the junctions are labeled the
effects of adding restrictions tend to be localized.

be labeled as type I (Illuminated), but since the program I
have described so far uses only the topology of a line
drawing, it has no way of distinguishing the scenes I have
drawn from others which should be labeled differently. For
example, in figure 6.7 I have redrawn figures 6.6A and 6.6B
so that the top surfaces are type SS (Self-Shadowed), but the
figures are topologically identical.

To decide whether a surface is self-shadowed or
illuminated, one must be able to associate shadow corners
with the vertices which cause them. In figure 6.7B, if C is
caused by B, then the top of the block is illuminated, and if
C is caused by A then the top of the block is self-shadowed.
To verify that A causes C, place a straight edge on the
figure. There is an interesting optical illusion in this
figure; it appears to me that the top surface of the block in
figure 6.8 should be type SS, but in fact if you use the
straight-edge test I described, you will find that it is
actually illuminated. (I did not put in any shading, to
prevent biasing the choice.)

In any case, I think that the issue here is not serious,
since the program still finds the correct edge labels for all
edges. In general I doubt that anyone will be too interested
in finding the illumination values exactly; in the program
they serve primarily as labeling aids, not as ends in
themselves. However, before going on to something else, I
would like to use this topic to illustrate a situation I have
encountered several times in the process of performing this
research. I noticed early in my study of scenes that if all
shadow corners and their causing vertices in a given scene
are connected by straight lines, these lines have roughly the
same slope throughout the scene, provided that the light
source is reasonably far away from the scene compared to the
scene size. I thought that this fact might aid me a great
deal in finding shadows. What I did not see was that until I
could locate shadows and their causing vertices, I couldn't
connect the two to find the characteristic slope; but if I
could find the shadows and vertices, then I knew how to solve
the problem already, and so I would not need to find this
slope at all! There is at least one type of case where this
slope is important, as I describe in the next section, but
for the most part the topology of scenes provides adequate
clues for finding shadows.

In figure 6.10A, each of the segments marked with a star
can be interpreted either as an obscuring edge or as a
concave edge, though in most cases choosing one or the other
for some line segment forces other segments to be interpreted
uniquely, as shown in figures 6.10B and 6.10C.

As in the previous section, there are scenes which are
topologically identical which can help to show why the
program finds all these labelings as reasonable
interpretations. Figure 6.11 shows five scenes which are
topologically identical to the labeled scenes shown in Figure
6.10C; in each of these scenes, the labeling shown seems to
me to be the most reasonable one or at least a plausible one.

Figure 6.12 shows the next problem case. Such a case
occurs when we can see only enough of an object so that it is
not possible to tell whether the region is a shadow or an
object. If it happens that the ambiguous region is brighter
than the background (or what would be the illuminated portion
of a partly shadowed surface if the feature occurs on the
interior of a scene), then the program can eliminate the
possibility that the region is a shadow. Unfortunately, if
the ambiguous region is darker than its neighbor, it cannot
tell whether the region is a shadow region or a dark object.
In figure 6.12, do you think that both A and B should really
be labeled as shadow regions? In fact neither A nor B can be
shadows! You can prove this for yourself by finding the
characteristic light source slope for the scene, using the
front object and its shadow. Then note that there can be no
hidden objects which could project A or B. Figure 6.13 shows
this construction. It is this type of distinction for which
the light source slope information could be useful.

I will not go through the process again of showing how
each of the labelings could arise. Clearly the
interpretation of A and B as shadows is reasonable for this
scene, since I can easily find a topologically equivalent
line drawing where some obscured objects could cause the
shadows. The program needs to know about gravity, support
and line segment directions in order to eliminate some of the
interpretations of region A. Every one of the
interpretations is possible for B.

A closely related ambiguity is illustrated in figure
6.14A. Again difficulties arise because a shadow-causing
junction is hidden. The fact that the program does not know
at this point about gravity can be visualized as meaning that
the objects which form both sides of a crack can appear
anywhere, just as if the two objects were glued together.
Figure 6.14B shows such a case.

The next type of problem involves support directly. An
example of this type of difficulty is shown in figure 6.15.
As in figure 6.10, each of the edges which is ambiguous is
marked with a star (*) in figure 6.15A, and the possible
labelings, both "reasonable" and "unreasonable", are shown in
figures 6.15B and 6.15C respectively. I have redrawn figure
6.15C in figure 6.16 to show scenes with the same topology
which have what were previously unreasonable labelings as
their reasonable ones. Actually in some of the cases I have
had to change the topology slightly. This happened because I
wanted to construct an example which contained shadows and
which exhibited all the ambiguities I show in figure 6.15;
while I was not able to easily find a scene which satisfied
these criteria and also did not require changes in topology,
there probably are such scenes. I do not believe that any
general rules can be derived from the needed modifications.

One final type of ambiguity is interesting and also 
serves to emphasize one of the findings of the work reported 
in this chapter. In figure 6.17 I show the two types of 
interpretations my program returns for holes. One of these 
interpretations is the one I expected; I was surprised that
the hole was ambiguous, but even more surprised to find that
I had missed an obvious alternate interpretation of the same
geometry. The alternate interpretation shown in figure 6.17B
does not even need to be drawn with different line segment
directions in order to appear reasonable.

The labelings which the program finds must be made up of
local features, each one of which is physically possible, but
it is not obvious that the features which remain should each
be part of a total labeling of the scene which is physically
possible. After all, the only conditions I impose are that
each of these features must agree with at least one other
feature at each neighboring junction. On the basis of the
fact that the main labeling program does not leave extraneous
labels on junctions, it seems clear that topology provides a
major portion of the cues necessary to understand a scene.

In the next chapter I show some heuristic rules which
can be used to eliminate some of the labelings which people
usually consider unlikely. In fact the true case is that
these labelings are not unlikely, but the scenes which have
these labelings as reasonable ones (to our eyes) do not often
arise in our experience. Unfortunately, heuristics sometimes
reject real interpretations, and indeed would reject each of
the interpretations shown in figures 6.11 and 6.16 in favor
of the ones in figures 6.10B and 6.15B. Nonetheless, in the
absence of solid rules, these heuristics can be useful. In
the chapter on region orientations I deal with the types of
techniques which would enable a program to find the labelings
which we would assign to these line drawings without resort
to heuristics.



7.0 NON-TRIHEDRAL VERTICES & RELATED PROBLEMS 

So far I have assumed that all the junctions I am given
are normal trihedral junctions and essentially that the line
drawing which I am given is "perfect". When a program has to
be able to accept data from real line finders and from
arbitrarily arranged scenes, these criteria are rather
unrealistic.

In this chapter, I show how to correct some of these
problems in a passive manner. By passive I mean that the
program is unable to ask a line finding program to look more
carefully or to use alternative predicates at a suspicious
junction, and similarly that it cannot move its eye or
camera, or direct a hand to rearrange part of a scene in
order to resolve ambiguities (Gaschnig 1971).

Instead I handle these types of problems by including
labels for a number of the most common of these junctions in
the regular data base. In cases where the program confuses
these junction labelings with the regular labelings and where
I want a single parsing, I can easily remove these new types
of junction labels first, since I have included special
markers for each labeling of this type. Moreover, depending
on the reliability of the program which generates the line
drawing, I may wish to remove labels in different orders.
For example, if a line finding program rarely misses edges,
missing edge interpretations can be removed first; if a line
finding program tends to miss short line segments, then
accidental alignments are probably being generated by the
program, and these interpretations can be retained until
last. Therefore the labels for each type of problem are
marked with different indicators in the data base.

7.1 NON-TRIHEDRAL VERTICES 

Some non-trihedral vertices must be included in the data
base; indeed some are much more common than many of the
trihedral vertices. I will limit the number by including
only those non-trihedral vertices which can be formed by
convex trihedral objects.

The first type of vertex is formed by the alignment of a
vertex with a convex edge as shown in figure 7.1 and in
figure 7.2. In figure 7.3 a similar set of junctions is
shown for objects which MARRY (i.e. have coplanar faces
separated by a crack edge; see Winston 1970) along one edge,
but which have different face angles.

Figure 7.4 illustrates another common non-trihedral
vertex which results again from objects with dissimilar face
angles. This time I need a new type of edge, a separable
convex edge, labeled as shown in figure 7.4.

Figure 7.5 illustrates the types of non-trihedral
vertices which can occur when one block leans on another. In
order to keep these cases from being confused with other
trihedral junctions, I have introduced three new edge types.
These types can occur only in a very limited number of
contexts. Figure 7.6 shows some of the ways in which these
edges can appear.

In the data base each of the labelings shown in figures
7.1, 7.2, 7.3, 7.4, and 7.5, and any other junction labels
involving the leaning edges or the separable convex edges,
are marked as non-trihedral. Later, if I wish to find a
single parsing for a scene where there are still ambiguous
labels, removing these non-trihedral junctions, if possible,
may be a good heuristic.
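
One way to picture the markers and the removal heuristic is
sketched below in Python. Everything here is hypothetical: the
marker names, the dictionary layout, and the reading of "if
possible" as "only when some other labeling would remain at the
junction". A full program would also propagate each removal to
neighboring junctions, which this sketch omits.

```python
def remove_marked(labelings, marker):
    """Drop labelings carrying `marker`, unless that would leave none."""
    kept = [lab for lab in labelings if marker not in lab["markers"]]
    return kept if kept else labelings


def prune(labelings, order):
    """Apply the removals in the given order of marker types."""
    for marker in order:
        labelings = remove_marked(labelings, marker)
    return labelings


# Order suited to a line finder that rarely misses edges (an assumption).
ORDER = ["missing-edge", "non-trihedral", "accidental-alignment"]

junction = [
    {"name": "regular-1", "markers": set()},
    {"name": "lean-1", "markers": {"non-trihedral"}},
    {"name": "align-1", "markers": {"accidental-alignment"}},
]
remaining = [lab["name"] for lab in prune(junction, ORDER)]
print(remaining)  # ['regular-1']

# A junction with only a non-trihedral reading keeps it.
only_special = [{"name": "lean-2", "markers": {"non-trihedral"}}]
kept_special = [lab["name"] for lab in prune(only_special, ORDER)]
print(kept_special)  # ['lean-2']
```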

7.2 ACCIDENTAL ALIGNMENTS; FIRST TYPE

In this section I have not attempted to exhaustively
list every possible junction labeling which results from
accidental alignment, but have concentrated on including only
the most common cases. There is some justification for this,
in that ambiguities caused by accidental alignments can be
resolved by simply moving with respect to the scene.

Figure 7.7 lists all the junctions which can take part
in the first type of accidental alignment I will consider.
This type of alignment occurs when a vertex is closer to the
eye than an edge which appears to be but is not part of the
vertex. Thus the set of vertices in figure 7.7 are exactly
that subset of the scene/background boundary junctions
(Appendix 4) which contain only obscuring edges on the
scene/background boundary. Figure 7.7 shows only those
junctions which I include as sufficiently common. The rest
are excluded because they involve unusual concave geometries
like those found in SOMA cube pieces (SOMA cubes are
three-dimensional puzzles manufactured by Parker Bros. Inc.,
Salem, Mass.) or because they involve three-object edges or
because the resulting junction would have enough line
segments to require a designation of "SPECIAL" or because the
junction would require the alignment of the eye with three
points in space.

There is no regular junction which could be confused
with any of the ARROW or K junctions generated by the
alignment of the junctions shown in figure 7.7 with edges
behind them. To see why this is so, consider figures 7.8 and
7.9. Figure 7.8 gives names for the distinguishable region
segments for each type of junction. Figure 7.9 shows all the
K and ARROW junctions that can result from accidental
alignment with each of the junctions shown in figure
7.7. Notice that the background region can only appear in
segments ARROW1, ARROW2, K1, K2 and K3 in these accidentally
aligned cases, whereas for all trihedral ARROW and K
junctions which can appear on the scene/background boundary,
only segments ARROW0 and K0 (the segments of these junctions
which are greater than 180 and equal to 180 degrees,
respectively) can be part of the background. Of course for
the junctions where no segments are distinguishable (e.g.
FORKs) or where the junction appears on the interior of the
scene, these accidental alignment cases cannot be directly
distinguished from the regular cases.

At this writing, I have not included all these
accidental alignment types in the program's data base, but I 
have included most of the scene/background boundary cases and 
a number of the interior cases. In general, I have assumed 
that no non-trihedral edges or three-object edges will be 
among those obscured since both the alignment itself and the 
edge types are relatively unlikely, so their coincidence at a 
single junction is extremely unlikely. 

7.3 ACCIDENTAL ALIGNMENT WITHOUT OBSCURING EDGES 

Figure 7.10 shows some alignments which have shown up
frequently in scenes I have worked with. These junctions
have occurred because (1) our line finding program misses
short line segments (and therefore tends to include more
lines than it should in a single junction), (2) our line
finding program has a tolerance angle within which it will
call edges collinear, so some edges are called collinear even
when they are not, and (3) edges which lie in a plane
parallel to the surface on which they cast shadows are
parallel to the shadows they cast, so that alignments become
particularly likely when we use bricks, cubes, and prisms.

Figure 7.11 shows some other types of accidental shadow
edge alignment which our group's line finding program 
frequently yields; these junctions are relatively common 
because of the tendency of the program to miss short line 
segments, but each of these types of alignment can occur 
naturally as well. (For information on our line finding 
program see Horn 1971 and Shirai 1972.) 

7.4 ACCIDENTAL ALIGNMENTS; FINAL TYPE 

The worst type of accidental alignment, in terms of the
number of new junctions it can introduce, occurs when an edge
between the eye and a vertex appears to be part of the
vertex. Fortunately, all of the types of junctions which
these alignments introduce are either Ks, KAs or SPECIALs.
To see why this is so, look at figure 7.12. All these
labelings can be quite easily generated by a program which
operates on the regular data base. Notice that for each
obscured vertex labeling, there are three new labelings
generated, since the near region can have any of the three
illumination values.

Also notice that any of these junctions which appear on
the scene/background boundary can only be oriented with the
background in a junction segment type K1, K2, K3, KA1, KA2,
KA3, or KA4 (see figure 7.8). Therefore it is not difficult
to recognize the cases where accidental alignments of this
type occur on the scene/background boundary since none of the
regular trihedral junctions can ever appear on the
scene/background boundary in any of these orientations. (The
background can only appear normally in segments of type K0 or
KA0.)

The number of K junctions of this type which can occur 
is limited by the fact that two of the line segments (the 
collinear ones) must always be obscuring edges and so can be 
labeled in a total of 108 different ways (including region 
illuminations); the other two line segments can each be 
labeled in 81 ways, so there can be no more than 81x81x108 = 
708,588 possible K labelings. In fact, as usual, there are 
not nearly this many labelings. To find the limit on the 
number of these junctions, use figure 7.12 and Table 5.3 
together, as shown in Table 7.1. The numbers in Table 7.1 
are obtained by taking the total number of interior labelings 
for a type of junction (remember that this number includes 
TABLE labels as well), multiplying this number by the number 
of ways in which it can form a K junction and multiplying 
this number by three (since the obscuring region can have 
three types of illumination, independent of what the other 
labels are). Thus, for example, there are 109 ARROW 
labelings, and each can be used two ways to make a K label of 
this type (see figure 7.12), so the total number of K 
junctions due to obscured ARROWs is 109x2x3 = 654. Each 
ARROW labeling can be used in only one way to form a KA 
junction, so the total number of these is 109x1x3 = 327. 
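
The counting argument above can be checked mechanically. The 
following Python sketch is illustrative only: the function and 
variable names are assumptions introduced here, while the 
counts (108 and 81 ways per line segment, 109 ARROW labelings, 
three illumination values) are the ones quoted in the text. 

```python
# Upper bound on K labelings: the two collinear obscuring edges can be
# labeled in 108 ways in total, and each of the other two line segments
# in 81 ways.
K_UPPER_BOUND = 81 * 81 * 108

# The obscuring (near) region can take any of three illumination values.
ILLUMINATION_VALUES = 3


def obscured_vertex_labelings(interior_labelings: int, ways_to_align: int) -> int:
    """Count labelings produced when an edge is aligned with a vertex.

    interior_labelings -- number of interior labelings of the obscured vertex
    ways_to_align      -- number of ways that vertex can form the new junction
    """
    return interior_labelings * ways_to_align * ILLUMINATION_VALUES


ARROW_LABELINGS = 109  # count quoted in the text

k_from_arrows = obscured_vertex_labelings(ARROW_LABELINGS, 2)
ka_from_arrows = obscured_vertex_labelings(ARROW_LABELINGS, 1)
```

The same function would apply to each row of Table 7.1, given 
the interior labeling count for the junction type concerned. 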

While I could include these labelings directly in the 
data base, their number is clearly unwieldy. In any event, I 
managed to find a way to include the labelings exactly but in 
a manner somewhat different than those I have been dealing 
with so far. In order to show this method, I first have to 
fill in some gaps I left earlier. 

7.5 MORE CONTROL STRUCTURE 

In this section I return again to the main labeling 
program and describe what happens when the program is unable 
to label a scene consistently, using the set of labels with 
which it has been equipped. 

The program is written in MICRO-PLANNER, a programming 
language with automatic back-up facilities (Sussman et al 
1971). Before the program begins labeling a junction J, it 
saves the context of the junction (i.e. the labeling which 
existed before the program assigned any labels to J). As the 
program iteratively eliminates the labels which can now be 
removed because of the new constraints which J adds, it 
checks at each step to make sure that at least one label 
remains possible for each line segment. If this number ever 
goes to zero for any line segment, the program assumes that J 
is the source of the problem, i.e. that J needed a label that 
was not in the list assigned to it by the selection rules. 
When this happens, the program restores the context to the 
state that existed before it began labeling J, and it marks J 
to indicate that J cannot be labeled from the normal label 
lists. Once J has been marked in this manner, it appears to 
neighboring junctions to be just like a junction which has 
not been labeled yet, and therefore J imposes no conditions 
at all on the possible line labels for its neighbors. The 
program can then continue and, as long as two adjacent 
junctions are not left unlabeled at the end of the program's 
operation, every line segment can be assigned a value or set 
of values, just as if every junction had been labeled. 

The problem with this arrangement is this: suppose that 
the program is given a line drawing which has one junction 
that cannot be labeled from the regular set of junction 
labelings. Clearly if the program labels this junction last, 
it will be unable to label the junction and will give the 
correct result. However, if this junction is labeled before 
any of its neighbors, then it is, of course, automatically 
assigned labels from the normal set, for none of the 
surrounding junctions impose any constraints on it. In this 
case, one or more perfectly normal junctions in the scene 
will eventually be marked as unlabelable, and the resulting 
total labeling for the scene will be invalid. In general, if 
the bad junction is labeled toward the end of the program's 
operation, then the total scene labeling is correct, and if 
the junction is labeled early in the program's operation, the 
total scene labeling is incorrect. 
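
The save-and-restore behaviour and its dependence on labeling 
order can be sketched in a few lines. The following is a 
minimal Python illustration, not the MICRO-PLANNER program: 
the data layout (a dictionary from line names to sets of 
possible labels) and the function name are assumptions, and 
the sketch filters only the lines at the junction itself 
rather than propagating eliminations through the whole scene. 

```python
import copy


def label_junction(junction, candidates, line_values, unlabelable):
    """Try to label one junction; on failure restore the saved context.

    junction    -- name of the junction J
    candidates  -- list of dicts, each mapping a line at J to one line label
    line_values -- dict: line name -> set of labels still possible (mutated)
    unlabelable -- set that collects junctions which could not be labeled
    Returns True if J could be labeled from its candidate list.
    """
    saved = copy.deepcopy(line_values)  # context as it was before touching J

    # Keep only the junction labels compatible with the current line values.
    # A line that no junction has constrained yet accepts any label.
    surviving = [
        c for c in candidates
        if all(label in line_values.get(line, {label}) for line, label in c.items())
    ]

    lines = {line for c in candidates for line in c}
    for line in lines:
        line_values[line] = {c[line] for c in surviving if line in c}

    # If any line is left with no possible label, blame J and back up.
    if any(not line_values[line] for line in lines):
        line_values.clear()
        line_values.update(saved)
        unlabelable.add(junction)
        return False
    return True
```

With a hypothetical "bad" junction A whose normal list offers 
only the label "+" for a shared line, and a normal neighbor B 
which requires "-", labeling A first causes B to be marked 
unlabelable, whereas labeling B first causes A to be marked, 
which is the order dependence described above. 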

My first attempt at solving this problem was to label 
all Ks and KAs last. In many cases the Ks and KAs were then 
indeed correctly identified as unlabelable from the normal 
set. However, I managed to come up with a much neater 
solution which enables the program to generate labels for 
these otherwise unlabelable junctions. 

As before, I have the program label all Ks and KAs last, 
but this time I modified the labeling procedure. If a 
junction cannot be labeled from the normal set, instead of 
marking it unlabelable I generate possible labelings by 
modifying the line drawing so that it contains equivalent 
junctions which are not accidentally aligned, and then I 
label these junctions in the normal manner. Thus, as shown 
in figure 7.13, if the normal set of junctions is inadequate 
to label a K, the most reasonable alternative is that the 
junction is actually an obscured L vertex. Therefore I 
change the line drawing (saving the original of course) and 
try to label the new line drawing. This change is equivalent 
to moving the eye slightly to see what type of junction is 
obscured, except that since the program is unable to move its 
eye and therefore does not know what the real vertex type is, 
it keeps trying various alternatives until one works, or 
until it hits a default case. In the example shown, the 
program finds a reasonable interpretation on the first try. 
If it had not, then the program would next have tried to 
label the junction as an obscured ARROW, since ARROWS are the 
next most common type of junction after Ls. 

Notice that the condition for a modification to be 
reasonable is not as simple as the old condition for a single 
junction as illustrated in figure 7.14. The condition for 
figure 7.14A is that J0, J1, and J4 must all be labelable. 
Before, there was no condition joining J1 and J4; if they do 
not match now, it does not matter whether J0 can be labeled 
or not, because a total labeling would be impossible. This 
means that the program has to be able to save the context 
until it has finished checking the labeling of several 
junctions, and that it should only finalize the modifications 
when it has proved that every portion of the new line drawing 
is reasonable. To illustrate further, in figure 7.14B I show 
the modifications necessary to interpret J0 as an obscured 
ARROW junction. These modifications create a new junction, 
and the two junctions, J0 and JJ0, must both be checked; 
unless both can be labeled consistently this interpretation 
is impossible. 

In fact, I can carry this idea even further. Suppose 
that a K junction, J0, is actually an accidental alignment, 
but that since other K and KA junctions in the line drawing 
have not yet been labeled, J0 can be labeled from the normal 
set of labelings. Later another K, which should be labelable 
from the normal set, cannot be labeled, since the wrong choice 
was made for J0. To eliminate this type of difficulty, I 
require all K and KA junctions to agree, and if they do not 
agree, the program can back up to any of the K and KA 
junctions until it has actually tried every combination of 
interpretations for the junctions. Thus the program should 
not finalize any of the labels for K or KA junctions until 
all of them agree. 

This method is still not guaranteed to find the correct 
solution; the program will be satisfied with the first 
set of modifications for the K and KA junctions which gives a 
complete labeling. To be certain of including the correct 
solution the program would have to try every combination of 
interpretations for every K and KA and save all the ones 
which give complete labelings. Eventually I hope to include 
this ability when I modify the program to run in the CONNIVER 
language (McDermott & Sussman 1972); this language has 
better facilities for developing and saving parallel 
contexts than MICRO-PLANNER does. MICRO-PLANNER is 
oriented toward a tree search model of problem solving where 
the branches of a solution tree are explored until a correct 
solution is found. In my case, the problem is that there may 
be more than one correct solution. 

In any case, when I programmed this ability, I lumped a 
number of junction types together into a default case for two 
reasons: this lessened the possibility of stopping before 
getting the desired ("correct") solution, and it enabled the 
program to run much faster and required a much smaller 
program than would have been needed if I had included 
separate machinery for each type of junction. The program 
tries the possibilities for a K in the following order: 

(1) try to label the K from the normal label lists. 

(2) try to label the K as an obscured L vertex. 

(3) try to label the K as an obscured ARROW vertex. 

(4) if all these fail, label the K as two T junctions 
(see figure 7.15). 
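
The ordering above amounts to a chain of attempts with a 
default at the end. A minimal Python sketch follows; the 
function name and the convention that an attempt returns None 
when no consistent labeling exists are assumptions made for 
illustration, and the attempts themselves are passed in as 
callables rather than implemented here. 

```python
from typing import Callable, Optional, Sequence, Tuple

# Each attempt is a (name, function) pair; the function returns a labeling,
# or None when no consistent labeling can be found for that interpretation.
Attempt = Tuple[str, Callable[[], Optional[object]]]


def label_k_junction(attempts: Sequence[Attempt],
                     default: Callable[[], object]) -> Tuple[str, object]:
    """Try each interpretation in order; fall back to the two-T default."""
    for name, attempt in attempts:
        result = attempt()
        if result is not None:
            return name, result  # first interpretation that works is accepted
    return "two T junctions", default()
```

For example, passing attempts named "normal", "obscured L" 
and "obscured ARROW" in that order reproduces steps (1) to 
(3), and the default callable plays the role of step (4). As 
in the text, the first success ends the search, so later 
interpretations are never examined. 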

The default condition represents the exact opposite of 
the previous conditions. The two Ts result if, instead of 
moving the eye (by imagination) to see what vertex is behind 
the obscuring edge, the program moves its eye (by 
imagination) to completely cover the vertex and eliminate the 
accidental alignment. Notice that the default condition 
gives much weaker constraints than could be obtained by 
trying all the rest of the junction types explicitly. The 
only relation that must hold for the two T uprights is that 
the region between them (marked R in figure 7.15) have an 
illumination value which matches both uprights. Nonetheless 
this is a much stronger condition than is imposed by leaving 
the junction totally unlabeled and, in addition, the 
collinear segments (L-A-B, L-B-C, L-C-D in figure 7.15) can 
all be labeled unambiguously as occluding edges. The 
information I throw away requires that the two uprights be 
adjacent segments of the same vertex, where this vertex can 
presumably be labeled from the normal label lists. 

7.6 MISSING EDGES 

Missing edges usually occur when the brightness of 
adjacent regions is nearly the same, since most line finding 
programs depend heavily on steps in brightness to define 
edges. I have made no attempt to treat missing edges 
systematically, but have only included a few of the most 
common cases in the data base. Clearly missing edge junction 
labels could be systematically generated by a program merely 
by listing all possibilities for eliminating one edge from 
each junction label. This procedure would generate 
(n-1)x(old number of regular labels) for each junction type 
(where n is the number of line segments which make up the 
junction), and clearly this would be a rather unmanageable 
number of new labels. The number of new labels could be 
lessened somewhat by noting that certain types of edges such 
as cracks are likely to be missed whereas certain other edges 
such as shadows are relatively unlikely to be missed. 

Even if a program such as mine can recognize that a 
junction must be labeled as having a missing edge, problems 
still remain about exactly how the line drawing should be 
completed. This difficulty is illustrated in figure 7.16. 
Depending on the line segment directions and lengths, the 
missing edge junction D can be connected to vertex A, vertex 
B or vertex C, even though the topology of all the line 
drawings is identical. 

The missing edge junctions which are included in the 
program's data base are all L junctions which result from 
deleting one of the branches of a FORK junction with three 
convex edges. Incidentally, in the examples shown in figure 
7.16, my program finds each of the given interpretations, but 
finds no other interpretations, i.e. it finds no 
interpretations which do not involve missing edges. 

A rule which can be helpful in removing impossible 
missing edge interpretations is that if a region is bounded 
by only one junction which can be interpreted as having a 
missing edge in that region, then that missing edge 
interpretation is impossible. (There must be another 
junction to connect with the missing edge.) A similar rule 
depends on including the label that the missing edge would 
have had in each missing edge labeling. In this case, the 
rule is that not only must there be a pair of missing edge 
junctions around a region in order for either of them to be 
possible, but this pair must also match in the label that 
each gives to the missing edge. One final rule is that the 
previous rules only hold if the pair of missing edge 
junctions are not adjacent to one another (i.e. each pair of 
junctions can be connected by only one straight line). 
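
The three rules can be combined into a single pruning pass. 
The Python sketch below is one possible reading, not the 
program's actual code: the triple representation, the function 
name, and the treatment of adjacency (an adjacent junction is 
not accepted as a partner, since the two are already joined by 
a line) are all assumptions made for illustration. 

```python
def prune_missing_edge_candidates(candidates, adjacent=frozenset()):
    """Keep only missing-edge interpretations that have a matching partner.

    candidates -- iterable of (junction, region, label) triples, where label
                  is the label the missing edge would have had
    adjacent   -- set of frozenset({j1, j2}) pairs already joined by a line
    """
    candidates = list(candidates)
    kept = []
    for junction, region, label in candidates:
        has_partner = any(
            other != junction                 # a different junction ...
            and other_region == region        # ... on the same region ...
            and other_label == label          # ... giving the same edge label
            and frozenset((junction, other)) not in adjacent
            for other, other_region, other_label in candidates
        )
        if has_partner:
            kept.append((junction, region, label))
    return kept
```

A candidate survives only if some other, non-adjacent junction 
on the same region proposes a missing edge with the same 
label; a lone candidate on a region is discarded. 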

If more than one edge is missing, then a program 
requires greater constructive understanding than my program 
has, although I believe that there are reasonably simple 
rules which allow a program to solve scenes even if they are 
as bad as the one shown in figure 7.17. For example, Shirai 
has demonstrated that the silhouette of a scene contains a 
great deal of information about where interior lines and 
junctions can appear (Shirai 1972). Although he does not 
consider scenes with shadows, I believe that the same 
principles which he uses are applicable for shadowed scenes. 
Freuder has also written a sophisticated heuristic program 
which fairly reliably fills in edges missed by our group's 
line finding programs (Freuder 1971a, 1971b). 

7.7 HEURISTICS 

As I have mentioned earlier in several places, the 
program is able to remove junction labels selectively 
according to a crude probability measure of the relative 
likelihood of various individual feature interpretations. 
These heuristics are a poor substitute for foolproof rules; 
in essence I view the heuristics as an expedient method for 
handling problems I have not yet been able to solve properly. 
As I explained in Section 5.4, these heuristics may 
nonetheless be of considerable value in guiding programs 
which find sound solutions. 

There is not much to say about the heuristics 
themselves. The ones I am using currently lump all the 
"unlikely" junction labels into one class, the "likely" ones 
into another, and simply eliminate all the "unlikely" labels 
as long as there are "likely" alternatives. 

However there are some interesting cases where I have 
found that I can usually eliminate the unwanted 
interpretations of the problem scenes in Section 6.5. 
Obviously, to solve these cases 
exactly would require a great deal more programming effort. 

Heuristic 1: Try to minimize the number of objects in a 
scene interpretation. 

Implementations: 

(1) Make shadow L junction labels (see figure 7.18A) 
more likely than any other type of L junction. 

(2) Make labels representing interior TABLE regions more 
likely than the equivalent labels that do not involve TABLE 
regions. 

(3) If regions can be interpreted either as shadows or 
as objects, make shadow interpretations more likely. 

Heuristic 2: Eliminate interpretations that have points 
of contact between objects or between objects and the TABLE 
unless there is solid evidence of contact. 

Implementation: Make ARROW junction labels which have 
two concave edges and one convex edge (see figure 7.18B) less 
likely than ARROW labels of other types. 
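
The two-class scheme described in this section (drop every 
"unlikely" label whenever a "likely" alternative exists) can 
be sketched as follows. This is a Python illustration only; 
the function name and the predicate argument are assumptions, 
and the label names in the usage note are hypothetical. 

```python
def apply_likelihood_heuristic(labels, is_likely):
    """Drop "unlikely" labels as long as a "likely" alternative remains.

    labels    -- candidate labels for one junction
    is_likely -- predicate that classes a label as "likely" (True) or not
    """
    likely = [label for label in labels if is_likely(label)]
    # If nothing is "likely", keep every label rather than leave none.
    return likely if likely else list(labels)
```

For instance, with candidates ["shadow-L", "object-L"] and a 
predicate that accepts only "shadow-L", the function returns 
["shadow-L"], in the spirit of implementation (1) of 
Heuristic 1. If no candidate is classed as likely, all 
candidates are kept. 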

These heuristics select interpretations (1), (2), and 
(7) from figure 6.10, interpretations A(1) and B(2) from 
figure 6.12, interpretation (1) from figure 6.14, and 
interpretations (1), (2), (3), (4), (5), and (9) from figure 
6.15. 

8.0 REGION ORIENTATIONS 

What has obviously been missing from all that I have 
shown so far is a connection between line segment directions 
on the retina and possible labelings for these lines. Such a 
connection is extremely useful if the program is to 
understand gravity and support. In this chapter I describe 
approaches to this problem which I have not yet included in 
my program. There is probably as much work required to 
properly add the ability to handle direction information as I 
have already invested in my program. Nonetheless, I believe 
that this chapter provides a good idea of the work that needs 
to be done as well as the physical knowledge that these 
additions will allow one to include in the program. 

8.1 LINE LABEL ADDITIONS 

To begin with, I investigate the partitioning of each 
edge type into three subtypes, a technique analogous to the 
ones I used earlier to divide concave edges into four classes 
and all edges into types according to their region 
illumination values. As in the case of occluding edges, the 
line values are only defined with respect to a reference 
point and direction, where the usual reference points are 
junctions. The three values are: 

(1) U (Up) - an edge directed up, away from the TABLE. 
The reference end is closer to the TABLE than any other 
points along the edge in the reference direction. 

(2) D (Down) - directed downward toward the TABLE. This 
is the opposite of U, the same edge but with the other end of 
the line and the opposite direction on the line as 
references. 

(3) P (Parallel) - parallel to the TABLE or in the plane 
of the TABLE. 

Notice that there are some immediate limitations that 
can now be set on the set of junction labelings: 

(1) Any shadow edge or concave edge marked with a "T", 
i.e. which is in the plane of the TABLE, automatically can 
have only one direction, P, in this partitioning. 

(2) Any junction which has one or more shadow or 
concave edges labeled "T" must have its edges of other types 
in the U direction, since the edges at such junctions must 
either be in the plane of the TABLE or above this plane. 

(3) Two edges which bound the same region and which are 
parallel or collinear must both have the same direction 
value, U, D, or P. This fact can be chained through several 
regions. 

Figure 8.1 illustrates these facts; U is indicated by 
placing an arrow along the side of a line segment pointing in 
the Up direction. 

Notice that these rules also allow a program to find 
horizontal surfaces, an important part of the notion of 
support. A horizontal surface can be defined in this system 
of notation as any region bounded by two or more edges which 
are both marked P ( || ) and which are not parallel to each 
other or collinear. Moreover, any edges which are in the 
plane of a horizontal surface can then also be marked as 
parallel to the TABLE, regardless of the directions of these 
lines on the retina. Finally, any junctions which bear a 
relationship to a horizontal surface, analogous to the one 
that I mentioned earlier for junctions which had segments in 
the plane of the TABLE, can similarly have their other 
segments labeled U. Figure 8.2 illustrates these points. 

These rules are not particularly helpful when there are 
no parallel edges; it is possible to chain some values in 
the absence of parallel edges and horizontal surfaces, but 
generally such chaining cannot be carried very far. 
Depending on the way that edges deviate from parallel, it is 
sometimes possible to assign an Up direction. (In some of 
the figures which follow I have not marked the lines with 
their normal labels, but have only included the direction 
labels for clarity.) See figure 8.3, and note that since 
edge L-A-B is not parallel to L-C-D, I can mark L-C-D with an 
Up direction as shown. This means that since L-E-F is 
parallel to L-C-D, it can also be marked with an Up 
direction. 
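
The definition of a horizontal surface given earlier in this 
section (a region bounded by two or more edges marked P which 
are neither parallel to each other nor collinear) can be 
sketched in Python as below. The representation of an edge as 
a (direction, retinal angle in degrees) pair, the function 
names, and the 5 degree tolerance are assumptions introduced 
for illustration only. 

```python
def angle_difference(a, b):
    """Smallest difference between two undirected line angles, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def is_horizontal(edges, tolerance=5.0):
    """Decide whether a region qualifies as a horizontal surface.

    edges     -- list of (direction, angle) pairs for the bounding edges,
                 with direction one of "U", "D", "P"
    tolerance -- angle (degrees) within which two lines count as parallel
    """
    p_angles = [angle for direction, angle in edges if direction == "P"]
    # Horizontal if some pair of P edges is neither parallel nor collinear.
    return any(
        angle_difference(a, b) > tolerance
        for i, a in enumerate(p_angles)
        for b in p_angles[i + 1:]
    )
```

A region bounded by P edges at 0 and 60 degrees qualifies; a 
region whose only P edges lie at 0 and 180 degrees does not, 
since those two lines are parallel on the retina. 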

8.2 AN EXAMPLE 

Using the methods I have already discussed plus one 
other piece of new information, I can show how to eliminate 
some labelings for a line drawing if I know the line segment 
directions. To see how these can help, consider again the 
example I showed in figures 5.10 and 5.11, as illustrated now 
in figure 8.4A. Because L-A-B is parallel to L-C-D and L-B-E 
is parallel to L-D-F, R1 must be horizontal, assuming that 
the labeling shown is true. By the same kind of reasoning R2 
must be horizontal. Now the additional rule is: two 
horizontal regions can only be separated by crack, shadow or 
obscuring edges. Therefore, the labeling shown is 
impossible, since L-C-G is a concave edge, and consequently 
cannot separate two horizontal regions. Similarly I can 
eliminate the labeling shown in figure 8.4B. 
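
The additional rule used in this example lends itself to a 
direct check. The following Python sketch is illustrative: the 
edge representation and names are assumptions, and only the 
rule itself (two horizontal regions can be separated only by 
crack, shadow or obscuring edges) comes from the text. 

```python
# Edge types that may separate two horizontal regions, per the rule above.
ALLOWED_BETWEEN_HORIZONTAL = {"crack", "shadow", "obscuring"}


def labeling_is_possible(edges, horizontal_regions):
    """Reject a labeling in which a disallowed edge separates two
    horizontal regions.

    edges              -- list of (edge_type, region_a, region_b) triples
    horizontal_regions -- set of regions already known to be horizontal
    """
    for edge_type, region_a, region_b in edges:
        both_horizontal = (region_a in horizontal_regions
                           and region_b in horizontal_regions)
        if both_horizontal and edge_type not in ALLOWED_BETWEEN_HORIZONTAL:
            return False
    return True
```

With R1 and R2 both horizontal, a labeling that makes the 
edge between them concave is rejected, as in the example of 
figure 8.4A; the same edge labeled as a crack would pass. 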

8.3 GENERAL REGION ORIENTATIONS 

In this section I define a quantization scheme which 
assigns to each visible region one of sixteen values. The 
regions are named in as sensible and simple a manner as I 
could devise, and are defined with respect to a coordinate 
system which is itself defined by the TABLE surface and the 
position of the eye viewing the scene. The region 
orientation values are each shown in figure 8.5; I assume 
that this figure will serve as an adequate specification for 
the meaning of the different orientation values. If the 
scene is moved with respect to the eye or vice-versa, then 
the region values (except Table and Horizontal) may change, 
and regions previously invisible may become visible. Thus 
the region orientation values are not inherent properties of 
the surfaces, but are only defined with respect to a 
particular eye-table arrangement. 

[Figure 8.5: the region orientation values. Legible entries: 
TA (Table), H (Horizontal), FRV (Front Right Vertical), 
FLV (Front Left Vertical), FV (Front Vertical), 
FRU (Front Right Up), FLU (Front Left Up), 
BLU (Back Left Up), BRU (Back Right Up), LU (Left Up), 
RU (Right Up), FU (Front Up), BU (Back Up), 
FLD (Front Left Down), FD (Front Down). The figure assumes 
that edges which are vertical on paper are vertical in the 
scenes.] 

If a region R1 is type FRV (Front Right Vertical) and an 
edge separating this region from region R2 is a shadow edge, 
then region R2 must also be type FRV (see figure 8.6A). The 
problem is not quite so simple when the other edge types are 
involved. To give the flavor of what I would like to be able 
to do in general, note that if an edge separating R1 and R2 
is vertical on the retina, and R1 appears to the right of R2 
on the retina, then R2 can only be type FLV or type FV or 
type FRV (see figure 8.6B). 

8.4 GENERAL LINE DIRECTIONS 

Before I can carry out this type of association in 
general, I must 

(1) define line directions on the retina, 

(2) define line directions in the scene domain with 
greater precision, and 

(3) show how to find the scene direction values, given 
a labeled line drawing and the retinal line directions. 

Throughout this chapter I assume that the eye is far 
enough away from the scene so that vertical edges in the 
scene project into North/South lines on the retina. Since 
the definition of North/South edges includes a tolerance 
angle ε, the eye does not need to be at infinity for this 
condition to hold. By the same reasoning I assume that 
parallel edges can be recognized without resort to 
perspective or vanishing point considerations. 

First I define the retinal line directions in terms of 
compass points as shown in figure 8.7. 

Next, in figure 8.8, I define the names for the 
directions of lines in the scene by showing examples for each 
possible type of direction. These names resemble the names for 
region orientations, but I will always use lower case letters 
in referring to the line names and will use upper case 
letters when I refer to the region names. 

Now to make the connections between the retinal and 
scene line directions, note that I can catalog all the 
possible edge directions in the scene domain which can map 
into each of the direction values on the retina. As an 
example of how to do this, in figure 8.9 I show all the edge 
directions possible for an edge which bounds a type FRV 
region. The diagrams in this figure show that an NE 
(Northeast) line on the retina which bounds a type FRV region 
can be an edge of types bru, brp, or brd, that an E (East) 
line on the retina which bounds a type FRV region can only be 
caused by a type brd edge, etc. Table 8.1 is a summary of 
the types of scene edges which can cause lines of each type 
on the retina, arranged according to the types of regions 
that each edge can bound. 

Now to tie everything together, notice that an edge can 
only separate two regions if the edge could have the same 
direction in both regions bounding the edge. Therefore, to 
find all the region pairs that an N (North) edge (as seen on 
the retina) could separate, look down the N column in Table 
8.1 and find all the pairs of regions which can share an edge 
which points in a particular direction. A north pointing 
edge can thus separate any of the following pairs of region 
types (this is not a complete list): 

((TA TA) (H H) (TA LU) (H LU) (H RU) 
(RU TA) (RU H) 
(FRV FRV) (FRV FLV) (FRV FV) 
(FLV FRV) (FLV FV) (FLV FLV) 
(FV FV) (FV FLV) (FV FRV) 
(LU H) (LU RU) (LU LU) 
(RU H) (RU LU) (RU RU) 
(BLU BRU) (BRU BLU)) 

Not all these pairs can be separated by the same types 
of edges; shadows and cracks can only separate regions with 
the same orientation values, and convex edge pairs become 
concave edge pairs if the order of the pairs is reversed. 
For example, a North line separating regions with orientation 
values (FLV FRV) represents a convex edge (where the ordering 
of the regions is in a clockwise direction), but if the 
orientation values are (FRV FLV) for a North line, this must 
represent a concave edge. This fact is illustrated in figure 
8.10. 
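
The column lookup described above (find all pairs of region 
types which can share an edge of some scene direction, for a 
given retinal direction) can be sketched in Python as follows. 
The table fragment is hypothetical: only the NE and E entries 
for FRV are taken from the text, the N entries are 
placeholders chosen for illustration, and none of it is copied 
from Table 8.1. The function name is likewise an assumption. 

```python
# Hypothetical fragment in the shape of Table 8.1:
# region type -> retinal direction -> scene edge directions that can
# bound a region of that type.  Only FRV/NE and FRV/E come from the text.
TABLE = {
    "FRV": {"N": {"vu"}, "NE": {"bru", "brp", "brd"}, "E": {"brd"}},
    "FLV": {"N": {"vu"}},
    "H": {"N": {"bp"}},
}


def separable_pairs(table, retinal_direction):
    """Region pairs that can be separated by a line of the given retinal
    direction, i.e. pairs that share at least one scene edge direction."""
    pairs = []
    for region_1, row_1 in table.items():
        for region_2, row_2 in table.items():
            shared = (row_1.get(retinal_direction, set())
                      & row_2.get(retinal_direction, set()))
            if shared:
                pairs.append((region_1, region_2))
    return pairs
```

On this fragment a North line yields the ordered pairs 
(FRV FRV), (FRV FLV), (FLV FRV), (FLV FLV) and (H H). The 
further distinctions noted above, such as which ordering is 
convex and which concave, would need the edge type as well and 
are not modeled here. 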

If the Up/Down/Parallel designations are also included 
in the regular labeling program, then it is possible to make 
even finer distinctions. Table 8.2 shows some of the lists 
of region orientation pairs which can be assigned to lines 
having the indicated labels and directions. 

A program can use this information in the following 
ways: 

(1) If there are ambiguities remaining after the 
regular labeling program has finished, pick a single 
labeling, assign region values using the lists shown in part 
in Table 8.2, and see whether this labeling can represent a 
possible interpretation; if the interpretation is not 
possible, then the program will be unable to assign 
orientation values to every region, very much like the case 
earlier in this chapter where a concave edge could not 
separate two horizontal surfaces. 

(2) Region illumination values can be tied in with the 
region orientation values. For example, if a scene is lit 
from the left, and the light-eye angle is less than 90 
degrees (see figure 8.11; the light-eye angle is the angle 
between the projections of the eye and the light onto the 
plane of the TABLE, as measured from the center of the 
scene), then a region cannot be labeled simultaneously as 
orientation type FLV and illumination type SS 
(Self-Shadowed). 

(3) All these facts provide a neat way to integrate 
stereo information into a scene description. For example, as 
shown in figure 8.12, if an edge is truly vertical (type vu) 
then it must appear as N (North) in any retinal projection of 
a stereo system. However an edge which is of type bp (back 
parallel) can appear to be N on the retina because of the 
particular placement of the eye with respect to the scene. 
If the eye is shifted slightly to the right, this edge will 
now appear to point NE (Northeast) and if the eye is shifted 
to the left, the edge will appear to point NW (Northwest). 
Clearly this knowledge would enable a program to much more 
severely restrict the region orientation pairs, and 
consequently the labelings, that can be assigned to a line 
drawing of a scene. Without the region and edge orientation 
formalisms (or other similar formalisms) it is not possible 
for a program to understand this stereo information, although 
one could undoubtedly find ad hoc ways of using the 
information. 

(4) All the possibilities for region orientations can 
be generated by the function I called ILLUMINE in Section 
4.1. For each labeling which the program finds, ILLUMINE can 
select region pairs according to the line directions and line 
labels, and build up a set of region orientation values in 
exactly the same manner that ILLUMINE builds up sets of 
region illumination values. The difference is that there are 
far too many region orientation values in general to possibly 
include them in precompiled form; the values must be 
generated from the greatly reduced set of possibilities that 
remain after the regular labeling program has completed its 
work. The reason why there are so many possibilities is that 
there are so many possible region orientations. Each edge 
can potentially have 16x16 = 256 region orientation pairs as 
opposed to the nine possible region illumination pairs. 
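
The stereo argument in item (3) can be illustrated by 
intersecting the edge types consistent with each view. The 
Python sketch below is an illustration under stated 
assumptions: the eye position names ("center", "right", 
"left"), the dictionary layout and the function name are 
hypothetical, and the two rows encode only the behaviour 
described in the text for a vu edge and for the particular bp 
edge of figure 8.12. 

```python
# How each scene edge type appears on the retina from each eye position.
# Rows follow the description in item (3); the position names are assumed.
APPEARANCE = {
    "vu": {"center": "N", "right": "N", "left": "N"},
    "bp": {"center": "N", "right": "NE", "left": "NW"},
}


def consistent_edge_types(observations, appearance=APPEARANCE):
    """Scene edge types that agree with every observed retinal direction.

    observations -- dict: eye position -> retinal direction seen from there
    """
    return {
        edge_type
        for edge_type, views in appearance.items()
        if all(views.get(eye) == seen for eye, seen in observations.items())
    }
```

A single view in which the edge appears N leaves both types 
possible; a second view from the right settles the matter, 
since the edge then appears N only if it is type vu and NE 
only if it is type bp. 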

8.5 SUPPORT 

Using the region orientation values, I can now define 
the set of edges along which support must hold, the set of 
edges along which support can hold, and the set of edges 
along which support cannot hold. By support I mean what is 
commonly termed either resting on or leaning on. 

To start with, I can eliminate from consideration any 
edges which are shadows, convex edges, obscuring edges, or 
concave edges made up of one object or of three objects, and 
I can say for certain that support is exhibited along any 
concave edge which has the TABLE as a bounding region. In 
addition, edges labeled as "leaning" (see Section 7.1) point 
to places where support relations must hold, although support 
does not hold along the leaning edges themselves, since these 
are either obscuring or convex edges. 

The important fact is that these edges exhibit support 
regardless of their directions on the retina, so that there 
is no problem with edges such as L-A-B in figure 8.13. The 
best previous rules to find where support holds in a scene 
(see Winston 1970) are not able to handle cases like this; 
Winston's rules were biased toward finding ARROWS, Ks and Xs 
which have vertical (or at least upward pointing) lines as do 
all of the cases in figure 8.14 (this figure is a copy of 
figure 2-1*1 from Winston 1970). In addition, Winston's rules 
failed to find one support relation for the leaning block; 
his rules assumed that objects would be supported by face 
contact only. 

Although my program can find support In cases like 
figure 8.13, it is important to note that, in general, it is 
not possible to use my regular labelings and line directions 
alone to find which edges exhibit support and which do not. 
Suppose that on the basis of the frequency of crack edges 
like the ones shown in figure 8.15A I decided to label as 
supporting crack edges those in which the arrow of the crack 
label points SW, W, or NW, and to class all the others 
together as being crack edges without support relations. 
Then in figure 8.15B edges L-B-C and L-C-D would be correctly 
marked but L-A-B would not. I could patch up the rule by 
saying that if support holds for one non-collinear line in an 
X junction it must hold for the other non-collinear line of 
the X as well. Unfortunately this rule causes the program to 
assert that support holds between the two objects in figure 
8.15C, since support would be transferred by the rule from 
L-B-C to L-A-B. 

Similarly, for concave edges I cannot use line 
directions and the direction of the arrow on the label to 
define support. As an example, observe that while L-A-B in 
figure 8.15D does not exhibit support, L-C-D in figure 8.15E 
does. 

Region orientation values can help to avoid these 
problems, at least for some cases. (There are some cases, 
such as the one in figure 8.15F, where I do not know whether 
to say that support holds along L-A-B and L-B-C or not.) 
Interestingly enough, with region orientations specified, I 
do not necessarily need line directions, although I certainly 
need line directions to find the region orientation values to 
begin with. 

An example of an edge where support must hold is any 
concave edge which has a horizontal surface on its left when 
one looks along the edge in the direction of its "arrow", as 
does L-C-D in figure 8.15E. 

Some examples of edges where support cannot hold are 
concave edges which have vertical surfaces (FRV, FV, or FLV) 
or downward pointing surfaces (FRD, FD, or FLD) on the left 
of the edges when looking along the direction of the "arrow"; 
line L-A-B in figure 8.15D is an edge of this type. 
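
The rules stated so far for concave edges can be collected into a 
small lookup. The sketch below is an illustration only and is not 
code from the program: the function name, the three result strings, 
and the name HORIZONTAL for a horizontal surface are assumptions, 
while FRV, FV, FLV, FRD, FD, and FLD are the orientation names used 
above. 

```python
# Orientation names taken from the text above.
VERTICAL = {"FRV", "FV", "FLV"}
DOWNWARD = {"FRD", "FD", "FLD"}
# Assumed placeholder name; the text only says "a horizontal surface".
HORIZONTAL = "HORIZONTAL"


def concave_edge_support(left_orientation, touches_table=False):
    """Classify a two-object concave edge as 'must', 'cannot', or 'can'.

    left_orientation is the orientation of the surface on the left of
    the edge when looking along the direction of its arrow.
    touches_table is True when the TABLE is one of the bounding regions.
    """
    if touches_table or left_orientation == HORIZONTAL:
        return "must"
    if left_orientation in VERTICAL or left_orientation in DOWNWARD:
        return "cannot"
    return "can"  # not decided by these rules alone


print(concave_edge_support(HORIZONTAL))  # edge like L-C-D in figure 8.15E
print(concave_edge_support("FV"))        # edge like L-A-B in figure 8.15D
```

Any orientation not covered by the rules falls into the "can" class, 
which is the class that needs further evidence to resolve. 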

While I do not show how to do so here, I believe that 
the best way to add the understanding of support to the 
framework of my program is to: 

(1) add support labels to lines in junction labelings 
where support can hold, and add these labelings to the 
regular set of labels, 

(2) as usual, do nothing when a line represents 
unambiguously a support edge or an edge without support, and 

(3) when there is ambiguity, use the region orientation 
values to help decide the issue. To do this, note that since 
there is a connection between the edges which can have 
support, line directions, and region orientations, I can use 
the function ILLUMINE again to eliminate impossible 
combinations and hopefully decide where support can and cannot 
hold. 

I have no great confidence that such a system will show 
where support must hold for certain, but the knowledge about 
where support can hold, combined with the knowledge that every 
object must be supported somehow, should allow the program to 
do quite well. I suspect that the program will be quite good 
at finding places where support cannot possibly hold. To 
solve these problems fully a program needs considerable 
knowledge about stability, gravity, and friction. These 
problems are outside the scope of this paper; for a 
discussion of some of the issues involved see Blum et al 
1970. 

To give a feeling for the number of new junctions which 
would be required in the data base, I have shown some 
junctions in figure 8.16 which can involve support. Figure 
8.16A illustrates the fact that any concave edge which 
touches the TABLE must be a support edge. In figure 8.16B, 
if the crossbar of the T (the collinear lines) exhibits 
support on one of its halves, then it must exhibit support on 
the other half as well and the support direction must be the 
same for both of these edges. Similarly, in figure 8.16C 
both non-collinear edges must have the same support or lack 
of support values. If each of the branches which can 
potentially exhibit support relations were labeled 
independently, then the cases in figures 8.16A and 8.16B 
would each have 27 possible support assignments instead of 
three and nine respectively, and the case in figure 8.16C 
would have 9 assignments instead of the actual three. Thus 
the same kinds of techniques which I have shown earlier for 
other descriptions would almost certainly work well for 
support cases too. Finally, obscuring edges, which have up 
to now accounted for the biggest increases in the numbers of 
new labels when the old labels were split into subtypes, do 
not even take part in this partitioning, so that the increase 
in the total number of labelings should be well within 
bounds. 
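
The counts quoted for figures 8.16B and 8.16C can be checked by 
brute force. The sketch below assumes three possible support values 
per edge (support in one direction, support in the other, or no 
support); this is consistent with the totals 27 and 9 quoted above 
but is an assumption, as are the value names and the way the 
agreement constraints are encoded. 

```python
from itertools import product

# Assumed three-valued support label for a single edge.
VALUES = ("SUPPORT-A", "SUPPORT-B", "NONE")


def count_assignments(n_edges, same=()):
    """Count assignments to n_edges edges in which every index pair
    listed in `same` is forced to carry the same support value."""
    return sum(
        all(combo[i] == combo[j] for i, j in same)
        for combo in product(VALUES, repeat=n_edges)
    )


# Figure 8.16B: three edges; the two collinear halves of the T agree.
independent_t = count_assignments(3)
constrained_t = count_assignments(3, [(0, 1)])

# Figure 8.16C: the two non-collinear edges agree.
independent_c = count_assignments(2)
constrained_c = count_assignments(2, [(0, 1)])

print(independent_t, constrained_t, independent_c, constrained_c)
```

Under these assumptions the enumeration gives 27 and 9 for the T and 
9 and 3 for the second case, matching the figures quoted in the text. 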

Of course my present program can already list lines 
where support may hold (i.e. all crack and two-object concave 
edges), and as before, simple heuristics would allow the 
program to say with some confidence where support could or 
could not hold. Clearly, it would also be quite natural to 
call some of the support assignments in figure 8.16 "likely" 
and certain others "extremely unlikely". 

9.0 HISTORICAL PERSPECTIVE 

It is instructive to reexamine earlier vision work which 
dealt with similar problems in the light of the formalisms I 
have presented in this paper. In this chapter I review the 
work of Guzman (Guzman 1968), Rattner (Rattner 1970), Orban 
(Orban 1970), Freuder (Freuder 1971a, 1971b), Dowson (Dowson 
and Waltz 1971, Dowson 1971a, Dowson 1971b), Huffman (Huffman 
1971), and Clowes (Clowes 1971, Clowes et al 1971). 

In what follows I hope to give you some appreciation for 
the real advances in thinking about vision which were brought 
about by these authors. Ten years ago the whole area of 
computer vision was uncharted territory, and it was certainly 
far from obvious where one should begin. Today, while there 
are innumerable questions still unanswered, we have some 
definite ideas about how vision systems could be organized 
and about the reasons why many appealing systems such as 
perceptrons and template matching schemes are inadequate 
models for vision systems (Minsky & Papert 1970). 

9.1 GUZMAN'S SEE PROGRAM 

Guzman's work is probably the most famous of the earlier 
vision work, and indeed his approach was a dramatic departure 
from what had been done before him. His formalisms were 
designed to group regions together into bodies. Basically 
his program did this by identifying each line in a line 
drawing as linking or not linking, where linking means that 
the regions on both sides of the line belong to the same body 
and not linking means that there is no evidence about the 
line; it may be either linking or the regions on either side 
of the line may belong to different bodies. Guzman used a 
set of junction types exactly as my program does (L, ARROW, 
T, etc.) but he included only one labeling for each type of 
junction. Guzman's junction set is shown in figure 9.1. 

There can be two conditions for any line in a line 
drawing after the labelings have been assigned to each 
junction: 

(1) the labels at either end of a line agree, in which 
case the labels are assumed to be correct, or 

(2) the labels on a line do not agree; in this case 
heuristics are invoked to settle the issue in favor of one or 
the other of the labels. 

[Figure 9.1: Guzman's junction set, marking the links for the 
L, K, ARROW, X, FORK, XX, T, and PEAK junction types.] 

As examples of these heuristics, Guzman originally 
linked regions if either end of a line were marked with a 
linking label. Later he added a system using "weak" and 
"strong" links to allow more subtle weighting of 
possibilities and, still later, he added a link inhibition 
feature which provided evidence against linking certain 
regions. Rattner (Rattner 1970) worked out various 
extensions to Guzman's work along these lines. 
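
Guzman's original rule, as described above, can be sketched with a 
union-find structure: a line links the regions on its two sides if 
the junction at either end marks it as linking, and the bodies are 
the resulting groups of regions. This is a minimal illustration and 
not SEE itself; the tuple layout for lines and all names are 
assumptions, and the later weak and strong links and the link 
inhibitions are not modeled. 

```python
def group_regions(regions, lines):
    """Group regions into bodies.

    lines is an iterable of (region_a, region_b, link_1, link_2), where
    link_1 and link_2 say whether the junction at each end of the line
    marks the line as linking (hypothetical representation).
    """
    parent = {r: r for r in regions}

    def find(r):
        # Follow parent pointers to the representative, halving the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b, link_1, link_2 in lines:
        if link_1 or link_2:  # a link label at either end is enough
            parent[find(a)] = find(b)

    bodies = {}
    for r in regions:
        bodies.setdefault(find(r), set()).add(r)
    return sorted(bodies.values(), key=sorted)


example = group_regions(
    ["R1", "R2", "R3", "R4"],
    [("R1", "R2", True, False),   # one end votes to link
     ("R2", "R3", False, False),  # no evidence, so no link
     ("R3", "R4", True, True)],   # both ends agree on linking
)
print(example)
```

In this example R1 and R2 form one body and R3 and R4 another, 
because the only line between R2 and R3 carries no link label. 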

As it turned out, the link inhibition feature proved to 
be a much more powerful method than the previous methods he 
had tried. Basically this is because the link inhibition 
technique was less local than the previous links had been. 
The assignment of a link inhibition between two regions has 
consequences for every line which separates the two regions, 
unlike the linking mark which only serves as one piece of 
evidence in favor of linking two regions. In terms of my 
program, the program using links only is very much like what 
my program would be if I divided my labels up into those 
which had PLUS (convex) marks and all the rest (assume that 
there are no shadows). The link inhibition labels would be 
those which have an arrow on the line segment, such as 
occluding edges, cracks, etc. The only strong evidence for 
linking regions comes from ARROW and FORK junctions, and of 
these the ARROW junctions are the more important, since 
(ignoring shadows and separable PLUS edges) every ARROW 
labeling links the two regions which bound the shaft of the 
ARROW. In contrast, there are a number of FORK junctions 
which have non-linking lines (see figure 9.2). 

But if link inhibitions are used there is considerable 
evidence in ARROW, T, X, and K junctions; in fact Freuder 
has shown that if only link inhibitions are used, the program 
works just about as well as Guzman's full program. 

There are numerous problems with Guzman's approach. 
First, his system simply does not work very well; for 
carefully chosen scenes it will find the correct results, but 
the program is very easy to fool. As Winston showed (Winston 
1968) Guzman's program fails badly on scenes with holes, and 
obviously the program is worthless for scenes with shadows. 
If I map my labelings into Guzman's binary scheme there are 
examples of virtually every possible labeling for each 
junction type within my data base. Thus it becomes obvious 
that Guzman's labels are simply the most probable 
combinations of links for scenes without shadows. As such, 
his program really has very little understanding of the world 
(see Winston 1972a, 1972b). 

Second, Guzman's approach is difficult to extend. This 
is due to the use of only one labeling for each junction and 
consequent heavy dependence on special purpose heuristics, 
and due also to the fact that virtually all the linking 
information for a line comes only from the two junctions at 
the ends of the line. There is no systematic way to use any 
information except locally. (The only exceptions are 
Guzman's use of matched Ts, the link inhibition information, 
and regions which meet along more than one edge.) As an 
example, Orban's extension of Guzman's program to include 
shadows (Orban 1970) depends exclusively on the observation 
that shadows frequently have chained L and X junctions. But 
despite the fact that Orban's program does have a slightly 
greater understanding of the meaning that scene features can 
have, it is not a systematic extension. Like almost all the 
extensions suggested for Guzman's work, it is a patchwork 
method: to handle a new distinction, pick a few common 
features that display the distinction and then adjust the 
rest of the program to avoid making disastrous errors. 



Third, this approach leaves a great deal unexplained. 
Certainly there is a great deal more to understanding a scene 
than simply being able to connect the regions into bodies. 

So far I have been dealing with the ways in which 
Guzman's approach was deficient, but it has strong features 
as well. Guzman was the first person, to my knowledge, to 
get away from the idea of storing descriptions of particular 
objects and trying to match these descriptions to a given 
scene. Roberts (Roberts 1963) had used this method and in 
fact others continued to do even less sophisticated template 
matching of sorts well after Guzman published his work. In 
contrast, Guzman's method works for arbitrary scenes 
containing trihedral vertices and gives some answer for any 
scene presented to it. Perhaps the most appealing feature of 
SEE was its simplicity and clarity; there are no 
transformations, coordinates, or hidden lines, and in fact 
only topology is used. Guzman's great insight was that by 
describing the physical characteristics of a relatively small 
number of local features, one can use simple decision 
procedures to derive much less local facts about arbitrary, 
unfamiliar scenes. 

Guzman's work was also instrumental in initiating two 
fruitful lines of research which are still active. This 
paper is along the line defined by Huffman and Clowes 
(Huffman 1971, Clowes 1971). The other line is the work on 
heterarchy (for excellent discussions of both Guzman and 
heterarchy see Winston 1972a or 1972b, and Minsky & Papert 
1972). 

9.2 WORK AFTER GUZMAN; HUFFMAN & CLOWES 

Huffman was motivated partly by his observation of the 
lack of semantic content in Guzman's program to suggest a 
richer set of labels than link and do-not-link. (Whether 
Clowes came upon the same ideas independent of Guzman or not 
I do not know.) Clearly both were influenced by Guzman's 
"grammatical" approach to scene processing. Their great 
insight was that by describing edges more precisely one could 
use definite rules rather than probabilistically based 
heuristics to choose scene interpretations. Moreover they 
showed that one could even say with some assurance that 
certain line drawings could not even correspond to real 
physical scenes; compare this with the fact that Guzman's 
program rather blindly returns some decomposition into bodies 
for any line drawing, and you will get some idea of the 
increase in understanding implicit in Huffman's and Clowes' 
work. 

Both Huffman and Clowes also worked with a construction 
for representing region orientations called the dual graph 
which influenced my thinking on region orientations (Huffman 
1971, Clowes et al 1971). Unfortunately, there is no neat 
way that I could see to integrate the dual graph into a 
labeling scheme. In any case, I owe Huffman and Clowes a 
considerable debt. 

9.3 AN ACCOUNT OF MY EFFORTS 

When Dowson and I began working in this area, we 
envisioned a tree searching program which would attempt to 
assign labelings from a reasonably small set (like those of 
Huffman and Clowes) to a line drawing. Dowson came up with a 
set of junctions involving cracks, and I generated a list of 
shadow junctions (Dowson & Waltz 1971). Dowson then 
developed VIRGIN, a tree search type labeling program (Dowson 
1971b) to apply this knowledge to real scenes. He 
immediately ran into serious problems, since even the 
simplest scenes required huge amounts of computer space, and 
the program ended up with many possible labelings for each 
scene. Most of these labelings only differed by one or two 
line labels, but each took a considerable amount of 
time to produce. It did not become obvious to me until 
somewhat later that tree search was the wrong model for this 
problem. 

In my proposal for this work (Waltz 1971) I suggested a 
rather heterarchical model for labeling line drawings. At 
this time I had already noted that by beginning with the 
scene/background boundary I could cut down the search space 
considerably, and I listed a number of rules (related to the 
selection rules and region illumination types) which I 
thought could further speed up and increase the power of a 
program. I also showed that region orientations could be 
handled easily if I restricted the universe of objects to 
include only those with right-angle edges. 

My major breakthrough came when I saw that the region 
orientations could be included as part of the edge labels, 
and then saw that I could also subdivide each edge type into 
several types according to the way that each edge could be 
decomposed. This idea was first suggested to me by Freuder 
(see Freuder 1971a) nearly a year before I used it. 



The last pieces fell into place when I made the decision 
to try using a filtering program before doing a tree search, 
based on my observations of Dowson's difficulties. Since the 
set of labelings I now had was far larger than the set which 
had clogged his program, I felt that I needed such a program 
to clear away the clutter of unneeded labelings and make tree 
searching feasible. I was genuinely surprised when the 
filtering program returned unique labelings for most of the 
junctions in the first scenes I gave to it. From here on my 
work followed directly from the success of the combination of 
this filtering program and the much enlarged junction 
labeling sets. I think it is noteworthy that this work is 
the direct result of my interaction with the program, as 
opposed to being the result of a system I worked out first by 
hand and only then implemented in a program. 
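
The kind of filtering referred to here can be sketched as repeated 
pairwise elimination: a candidate labeling at a junction is 
discarded if, for some line leaving the junction, no labeling still 
remaining at the junction on the other end of that line gives the 
line the same label. The sketch below is an illustration under 
simplifying assumptions, not the program itself: the dictionary 
layout, the function names, and the convention that a line label 
reads the same from both of its ends are all assumptions. 

```python
def has_match(remaining, line_ends, junction, line, label):
    """True if the junction at the other end of `line` still has a
    candidate labeling that assigns `label` to that line."""
    a, b = line_ends[line]
    neighbor = b if junction == a else a
    return any(other[line] == label for other in remaining[neighbor])


def filter_labelings(candidates, line_ends):
    """candidates: junction -> list of labelings (dict line -> label).
    line_ends: line -> (junction, junction).
    Returns the candidates that survive mutual elimination."""
    remaining = {j: list(labs) for j, labs in candidates.items()}
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        for junction, labelings in remaining.items():
            kept = [
                lab for lab in labelings
                if all(has_match(remaining, line_ends, junction, ln, lb)
                       for ln, lb in lab.items())
            ]
            if len(kept) != len(labelings):
                remaining[junction] = kept
                changed = True
    return remaining


# Three junctions in a chain: J1 --a-- J2 --b-- J3.
line_ends = {"a": ("J1", "J2"), "b": ("J2", "J3")}
candidates = {
    "J1": [{"a": "+"}, {"a": "-"}],
    "J2": [{"a": "+", "b": "+"}, {"a": "-", "b": "-"}],
    "J3": [{"b": "+"}],
}
result = filter_labelings(candidates, line_ends)
print(result)
```

In this small example the single candidate at J3 removes one 
labeling at J2, which in turn removes one at J1, so every junction 
ends with a unique labeling without any tree search. 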

There is one lesson which I think is important, perhaps 
more important than any other in terms of the ways it might 
aid future research. For a long time after I had found the 
ways of describing region illuminations and edge 
decompositions, I tried to find a clever way to collapse the 
large set of line labels these distinctions implied into a 
smaller and more manageable set which would retain all the 
"essential" distinctions, whatever they were. Frustrated in 
this attempt for quite a while, I finally decided to go ahead 
and include every possible labeling in the program, even 
though this promised to involve a good deal of typing. I 
hoped that when I ran the program certain regularities would 
appear, i.e. that when the program found a particular 
labeling for a junction it would always find another as well, 
so that the two labelings could be collapsed into one new one 
with no loss of information. Of course, as it turned out, it 
was the fact that I had made such precise distinctions that 
allowed the program to find unique labelings. The moral of 
this is that one should not be afraid of semi-infinities; a 
large number of simple facts may be needed to represent what 
can be deduced by computation using a few general ideas. 

It also seems logical that, if anything, people are able 
to make much finer distinctions than I was considering, and 
that these distinctions had value for perception. For 
example, people can distinguish between obtuse or "blunt" 
edges (such as those of a regular dodecahedron), right angle 
edges (such as those of a cube), and acute or "sharp" edges 
(such as those of a regular tetrahedron). 

Finally, I do not see any reason to suppose that we 
should be able to get along with distinctions on the order of 
one or two hundred, any more than a language program with a 
vocabulary of this size could comprehend or express anything 
very interesting. But by the same token, it may be that a 
vision system does not have to be too large for available 
computers in order to reach a point of diminishing returns, 
just as an increase in vocabulary beyond 10,000 words would 
probably not add much to a language program's (or a person's) 
ability. 